Showing 1 - 10 of 2,937 for search: '"He XiaoDong"'
Recent advancements in scaling up models have significantly improved performance in Automatic Speech Recognition (ASR) tasks. However, training large ASR models from scratch remains costly. To address this issue, we introduce UME, a novel method that
External link:
http://arxiv.org/abs/2412.17507
The believable simulation of multi-user behavior is crucial for understanding complex social systems. Recently, large language model (LLM)-based AI agents have made significant progress, enabling them to achieve human-like intelligence across vario
External link:
http://arxiv.org/abs/2412.09237
Author:
Zhang, Tianle, Li, Dongjiang, Li, Yihang, Zeng, Zecui, Zhao, Lin, Sun, Lei, Chen, Yue, Wei, Xuelong, Zhan, Yibing, Li, Lusong, He, Xiaodong
The advancements in embodied AI are increasingly enabling robots to tackle complex real-world tasks, such as household manipulation. However, the deployment of robots in these environments remains constrained by the lack of comprehensive bimanual-mob
External link:
http://arxiv.org/abs/2405.18860
Author:
Zhang, Tianle, Guan, Jiayi, Zhao, Lin, Li, Yihang, Li, Dongjiang, Zeng, Zecui, Sun, Lei, Chen, Yue, Wei, Xuelong, Li, Lusong, He, Xiaodong
Offline reinforcement learning (RL) aims to learn optimal policies from previously collected datasets. Recently, due to their powerful representational capabilities, diffusion models have shown significant potential as policy models for offline RL is
External link:
http://arxiv.org/abs/2405.18729
The rapid advancement of large language models has revolutionized various applications but also raised crucial concerns about their potential to perpetuate biases and unfairness when deployed in social media contexts. Evaluating LLMs' potential biase
External link:
http://arxiv.org/abs/2312.15478
Author:
Mamat, Bahtiyar, Sheng, Cheng, He, Xiaodong, Hou, Jiayi, Xu, Peng, Wang, Kunpeng, Zhuang, Jun, Wei, Mingrui, Liu, Min, Wang, Jin, Zhan, Mingsheng
Rydberg atoms as versatile tools for quantum applications are extremely sensitive to electric fields. When utilizing these atoms, it becomes imperative to comprehensively characterize and mitigate any residual electric fields present in the environme
External link:
http://arxiv.org/abs/2312.02597
Author:
Yang, Yijun, Zhou, Tianyi, Li, Kanxue, Tao, Dapeng, Li, Lusong, Shen, Li, He, Xiaodong, Jiang, Jing, Shi, Yuhui
While large language models (LLMs) excel in a simulated world of texts, they struggle to interact with the more realistic world without perceptions of other modalities such as visual or audio signals. Although vision-language models (VLMs) integrate
External link:
http://arxiv.org/abs/2311.16714
Multimodal emotion recognition (MER) aims to detect the emotional status of a given expression by combining the speech and text information. Intuitively, label information should be capable of helping the model locate the salient tokens/frames releva
External link:
http://arxiv.org/abs/2309.02106
High-quality data is essential for conversational recommendation systems and serves as the cornerstone of the network architecture development and training strategy design. Existing works contribute heavy human efforts to manually labeling or designi
External link:
http://arxiv.org/abs/2306.09631
Author:
Fu, Li, Li, Siqi, Li, Qingtao, Li, Fangzhu, Deng, Liping, Fan, Lu, Chen, Meng, Wu, Youzheng, He, Xiaodong
Self-Supervised Learning (SSL) Automatic Speech Recognition (ASR) models have shown great promise over Supervised Learning (SL) ones in low-resource settings. However, the advantages of SSL are gradually weakened when the amount of labeled data incre
External link:
http://arxiv.org/abs/2306.02541