Showing 1 - 10 of 8,299 for search: '"Zeng, Wei"'
Large Language Models (LLMs) have shown considerable promise in code generation. However, the automation sector, especially in motion control, continues to rely heavily on manual programming due to the complexity of tasks and critical safety consider
External link:
http://arxiv.org/abs/2410.15154
Data surveillance has become more covert and pervasive with AI algorithms, which can result in biased social classifications. Appearance offers intuitive identity signals, but what does it mean to let AI observe and speculate on them? We introduce AI
External link:
http://arxiv.org/abs/2410.03786
NFTracer: Tracing NFT Impact Dynamics in Transaction-flow Substitutive Systems with Visual Analytics
Impact dynamics are crucial for estimating the growth patterns of NFT projects by tracking the diffusion and decay of their relative appeal among stakeholders. Machine learning methods for impact dynamics analysis are incomprehensible and rigid in te
External link:
http://arxiv.org/abs/2409.15754
Emerging multimodal large language models (MLLMs) exhibit great potential for chart question answering (CQA). Recent efforts primarily focus on scaling up training datasets (i.e., charts, data tables, and question-answer (QA) pairs) through data coll
External link:
http://arxiv.org/abs/2407.20174
Researching the specificity of TCR contributes to the development of immunotherapy and provides new opportunities and strategies for personalized cancer immunotherapy. Therefore, we established a TCR generative specificity detection framework consist
External link:
http://arxiv.org/abs/2407.19349
Multi-modal embeddings form the foundation for vision-language models, such as CLIP embeddings, the most widely used text-image embeddings. However, these embeddings are vulnerable to subtle misalignment of cross-modal features, resulting in decrease
External link:
http://arxiv.org/abs/2407.12315
Personalized text-to-image models allow users to generate varied styles of images (specified with a sentence) for an object (specified with a set of reference images). While remarkable results have been achieved using diffusion-based generation model
External link:
http://arxiv.org/abs/2407.06642
Table reasoning transforms user requirements into corresponding answers according to the provided table, which is often integrated with natural language interfaces for lay users to explore tabular data effortlessly. Recent research exploits large lan
External link:
http://arxiv.org/abs/2406.03753
This work tackles the problem of geo-localization with a new paradigm using a large vision-language model (LVLM) augmented with human inference knowledge. A primary challenge here is the scarcity of data for training the LVLM - existing street-view d
External link:
http://arxiv.org/abs/2406.18572
Piano audio-to-score transcription (A2S) is an important yet underexplored task with extensive applications for music composition, practice, and analysis. However, existing end-to-end piano A2S systems faced difficulties in retrieving bar-level infor
External link:
http://arxiv.org/abs/2405.13527