Zobrazeno 1 - 10
of 118 071
pro vyhledávání: '"Computer Science - Computation and Language"'
Autor:
Wang, Weiyun, Chen, Zhe, Wang, Wenhai, Cao, Yue, Liu, Yangzhou, Gao, Zhangwei, Zhu, Jinguo, Zhu, Xizhou, Lu, Lewei, Qiao, Yu, Dai, Jifeng
Existing open-source multimodal large language models (MLLMs) generally follow a training process involving pre-training and supervised fine-tuning. However, these models suffer from distribution shifts, which limit their multimodal reasoning, partic
Externí odkaz:
http://arxiv.org/abs/2411.10442
Multimodal Large Language Models (MLLMs) are known to hallucinate, which limits their practical applications. Recent works have attempted to apply Direct Preference Optimization (DPO) to enhance the performance of MLLMs, but have shown inconsistent i
Externí odkaz:
http://arxiv.org/abs/2411.10436
Task-oriented dialogue systems rely on predefined conversation schemes (dialogue flows) often represented as directed acyclic graphs. These flows can be manually designed or automatically generated from previously recorded conversations. Due to varia
Externí odkaz:
http://arxiv.org/abs/2411.10416
Autor:
Chi, Jianfeng, Karn, Ujjwal, Zhan, Hongyuan, Smith, Eric, Rando, Javier, Zhang, Yiming, Plawiak, Kate, Coudert, Zacharie Delpierre, Upasani, Kartikeya, Pasupuleti, Mahesh
We introduce Llama Guard 3 Vision, a multimodal LLM-based safeguard for human-AI conversations that involves image understanding: it can be used to safeguard content for both multimodal LLM inputs (prompt classification) and outputs (response classif
Externí odkaz:
http://arxiv.org/abs/2411.10414
Sparse Autoencoders (SAEs) are a promising approach for extracting neural network representations by learning a sparse and overcomplete decomposition of the network's internal activations. However, SAEs are traditionally trained considering only acti
Externí odkaz:
http://arxiv.org/abs/2411.10397
Event Causality Identification (ECI) has become a crucial task in Natural Language Processing (NLP), aimed at automatically extracting causalities from textual data. In this survey, we systematically address the foundational principles, technical fra
Externí odkaz:
http://arxiv.org/abs/2411.10371
The recently released model, Claude 3.5 Computer Use, stands out as the first frontier AI model to offer computer use in public beta as a graphical user interface (GUI) agent. As an early beta, its capability in the real-world complex environment rem
Externí odkaz:
http://arxiv.org/abs/2411.10323
In recent years, text-to-image (T2I) generation models have made significant progress in generating high-quality images that align with text descriptions. However, these models also face the risk of unsafe generation, potentially producing harmful co
Externí odkaz:
http://arxiv.org/abs/2411.10329
Autor:
Alaeddini, Maliheh
Emotion detection is pivotal in human communication, as it significantly influences behavior, relationships, and decision-making processes. This study concentrates on text-based emotion detection by leveraging the GoEmotions dataset, which annotates
Externí odkaz:
http://arxiv.org/abs/2411.10328
Autor:
Uchendu, Adaku, Le, Thai
The surge of data available on the internet has led to the adoption of various computational methods to analyze and extract valuable insights from this wealth of information. Among these, the field of Machine Learning (ML) has thrived by leveraging d
Externí odkaz:
http://arxiv.org/abs/2411.10298