Showing 1 - 10 of 1,969 results for search: '"Xia, Rui"'
Author:
Bai, Ye, Chen, Jingping, Chen, Jitong, Chen, Wei, Chen, Zhuo, Ding, Chen, Dong, Linhao, Dong, Qianqian, Du, Yujiao, Gao, Kepan, Gao, Lu, Guo, Yi, Han, Minglun, Han, Ting, Hu, Wenchao, Hu, Xinying, Hu, Yuxiang, Hua, Deyu, Huang, Lu, Huang, Mingkun, Huang, Youjia, Jin, Jishuo, Kong, Fanliu, Lan, Zongwei, Li, Tianyu, Li, Xiaoyang, Li, Zeyang, Lin, Zehua, Liu, Rui, Liu, Shouda, Lu, Lu, Lu, Yizhou, Ma, Jingting, Ma, Shengtao, Pei, Yulin, Shen, Chen, Tan, Tian, Tian, Xiaogang, Tu, Ming, Wang, Bo, Wang, Hao, Wang, Yuping, Wang, Yuxuan, Xia, Hanzhang, Xia, Rui, Xie, Shuangyi, Xu, Hongmin, Yang, Meng, Zhang, Bihong, Zhang, Jun, Zhang, Wanyi, Zhang, Yang, Zhang, Yawei, Zheng, Yijie, Zou, Ming
Modern automatic speech recognition (ASR) models are required to accurately transcribe diverse speech signals (from different domains, languages, accents, etc.) given the specific contextual information in various application scenarios. Classic end-to-e
External link:
http://arxiv.org/abs/2407.04675
The Grounded Multimodal Named Entity Recognition (GMNER) task aims to identify named entities, entity types and their corresponding visual regions. The GMNER task exhibits two challenging attributes: 1) The tenuous correlation between images and text on soci
External link:
http://arxiv.org/abs/2406.07268
Published in:
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)
The ability to understand emotions is an essential component of human-like artificial intelligence, as emotions greatly influence human cognition, decision making, and social interactions. In addition to emotion recognition in conversations, the task
External link:
http://arxiv.org/abs/2405.13049
Model merging combines fine-tuned models derived from multiple domains, with the intent of enhancing the model's proficiency across various domains. The principal concern is the resolution of parameter conflicts. A substantial amount of existing
External link:
http://arxiv.org/abs/2403.02799
Visual commonsense contains knowledge about object properties, relationships, and behaviors in visual data. Discovering visual commonsense can provide a more comprehensive and richer understanding of images, and enhance the reasoning and decision-mak
External link:
http://arxiv.org/abs/2402.17213
High-quality, large-scale corpora are the cornerstone of building foundation models. In this work, we introduce MathPile, a diverse and high-quality math-centric corpus comprising about 9.5 billion tokens. Throughout its creation, we adhered
External link:
http://arxiv.org/abs/2312.17120
Knowledge Base Question Answering (KBQA) aims to answer factoid questions based on knowledge bases. However, generating the most appropriate knowledge base query code based on Natural Language Questions (NLQ) poses a significant challenge in KBQA. In
External link:
http://arxiv.org/abs/2311.02956
Large Language Models (LLMs), such as ChatGPT, have recently been applied to various NLP tasks due to their open-domain generation capabilities. However, there are two issues with applying LLMs to dialogue tasks. 1. During the dialogue process, users m
External link:
http://arxiv.org/abs/2310.03293
We observe that current conversational language models often waver in their judgments when faced with follow-up questions, even if the original judgment was correct. This wavering presents a significant challenge for generating reliable responses and
External link:
http://arxiv.org/abs/2310.02174
Over the last decades, substantial progress has been made in Structure from Motion (SfM). However, the vast majority of methods work in an offline manner, i.e., images are first captured and then fed together into an SfM pipeline for obtaining
External link:
http://arxiv.org/abs/2309.11883