Showing 1 - 10 of 32,045 results for search: '"Li Xu"'
Author:
Li, Yadong, Sun, Haoze, Lin, Mingan, Li, Tianpeng, Dong, Guosheng, Zhang, Tao, Ding, Bowen, Song, Wei, Cheng, Zhenglin, Huo, Yuqi, Chen, Song, Li, Xu, Pan, Da, Zhang, Shusen, Wu, Xin, Liang, Zheng, Liu, Jun, Lu, Keer, Zhao, Yaqi, Shen, Yanjun, Yang, Fan, Yu, Kaicheng, Lin, Tao, Xu, Jianhua, Zhou, Zenan, Chen, Weipeng
The salient multimodal capabilities and interactive experience of GPT-4o highlight its critical role in practical applications, yet it lacks a high-performing open-source counterpart. In this paper, we introduce Baichuan-Omni, the first open-source 7
External link:
http://arxiv.org/abs/2410.08565
Language-queried target sound extraction (TSE) aims to extract specific sounds from mixtures based on language queries. Traditional fully-supervised training schemes require extensively annotated parallel audio-text data, which are labor-intensive. W
External link:
http://arxiv.org/abs/2409.09398
Speech restoration aims at restoring full-band speech with high quality and intelligibility, considering a diverse set of distortions. MaskSR is a recently proposed generative model for this task. As other models of its kind, MaskSR attains high qual
External link:
http://arxiv.org/abs/2409.09357
Author:
Ma, Zongyang, Zhang, Ziqi, Chen, Yuxin, Qi, Zhongang, Yuan, Chunfeng, Li, Bing, Luo, Yingmin, Li, Xu, Qi, Xiaojuan, Shan, Ying, Hu, Weiming
Understanding the content of events occurring in the video and their inherent temporal logic is crucial for video-text retrieval. However, web-crawled pre-training datasets often lack sufficient event information, and the widely adopted video-level c
External link:
http://arxiv.org/abs/2407.07478
Speech restoration aims at restoring high quality speech in the presence of a diverse set of distortions. Although several deep learning paradigms have been studied for this task, the power of the recently emerging language models has not been fully
External link:
http://arxiv.org/abs/2406.02092
The automated synthesis of high-quality 3D gestures from speech is of significant value in virtual humans and gaming. Previous methods focus on synthesizing gestures that are synchronized with speech rhythm, yet they frequently overlook the inclusion
External link:
http://arxiv.org/abs/2405.13336
The violation of Lam-Tung relation in the high-$p_T^{\ell\ell}$ region of the Drell-Yan process at the LHC presents a long-standing discrepancy with the standard model prediction at $\mathcal{O}(\alpha_s^3)$ accuracy. In this Letter, we employed a mo
External link:
http://arxiv.org/abs/2405.04069
Facial expression recognition (FER) is vital for human-computer interaction and emotion analysis, yet recognizing expressions in low-resolution images remains challenging. This paper introduces a practical method called Dynamic Resolution Guidance fo
External link:
http://arxiv.org/abs/2404.06365
Author:
Li, Xu, Sun, Ruiqi, Lv, Jiameng, Jia, Peng, Li, Nan, Wei, Chengliang, Hu, Zou, Er, Xinzhong, Chen, Yun, Ban, Zhang, Fang, Yuedong, Guo, Qi, Liu, Dezi, Li, Guoliang, Lin, Lin, Li, Ming, Li, Ran, Li, Xiaobo, Luo, Yu, Meng, Xianmin, Nie, Jundan, Qi, Zhaoxiang, Qiu, Yisheng, Shao, Li, Tian, Hao, Wang, Lei, Wang, Wei, Xian, Jingtian, Xu, Youhua, Zhang, Tianmeng, Zhang, Xin, Zhou, Zhimin
Strong gravitational lensing is a powerful tool for investigating dark matter and dark energy properties. With the advent of large-scale sky surveys, we can discover strong lensing systems on an unprecedented scale, which requires efficient tools to
External link:
http://arxiv.org/abs/2404.01780
Author:
Wang, Jian, Li, Xu, Ma, Xingyue, Chen, Lan, Liu, Jun-Ming, Duan, Chun-Gang, Íñiguez-González, Jorge, Wu, Di, Yang, Yurong
Sliding ferroelectricity is a unique type of polarity recently observed in a properly stacked van der Waals bilayer. However, electric-field control of sliding ferroelectricity is hard and could induce large coercive electric fields and serious leaka
External link:
http://arxiv.org/abs/2403.06531