Showing 1 - 10 of 18,614 results for search: '"Chang, Wen"'
Recent advances in Video Large Language Models (Video-LLMs) have demonstrated their great potential in general-purpose video understanding. To verify the significance of these models, a number of benchmarks have been proposed to diagnose their capabi
External link:
http://arxiv.org/abs/2409.18111
Author:
Min, Xiongkuo, Gao, Yixuan, Cao, Yuqin, Zhai, Guangtao, Zhang, Wenjun, Sun, Huifang, Chen, Chang Wen
Traditional in the wild image quality assessment (IQA) models are generally trained with the quality labels of mean opinion score (MOS), while missing the rich subjective quality information contained in the quality ratings, for example, the standard
External link:
http://arxiv.org/abs/2409.05540
In this paper, we investigate the feasibility of leveraging large language models (LLMs) for integrating general knowledge and incorporating pseudo-events as priors for temporal content distribution in video moment retrieval (VMR) models. The motivat
External link:
http://arxiv.org/abs/2407.15051
Layout generation is the keystone in achieving automated graphic design, requiring arranging the position and size of various multi-modal design elements in a visually pleasing and constraint-following manner. Previous approaches are either inefficie
External link:
http://arxiv.org/abs/2406.02884
Author:
Diehl, Stefan, Joo, Kyungseon, Semenov-Tian-Shansky, Kirill, Weiss, Christian, Braun, Vladimir, Chang, Wen-Chen, Chatagnon, Pierre, Constantinou, Martha, Guo, Yuxun, Hutauruk, Parada T. P., Jo, Hyon-Suk, Kim, Andrey, Kim, Jun-Young, Kroll, Peter, Kumano, Shunzo, Lee, Chang-Hwan, Liuti, Simonetta, McNulty, Ronan, Son, Hyeon-Dong, Sznajder, Pawel, Usman, Ali, Van Hulse, Charlotte, Vanderhaeghen, Marc, Winn, Michael
QCD gives rise to a rich spectrum of excited baryon states. Understanding their internal structure is important for many areas of nuclear physics, such as nuclear forces, dense matter, and neutrino-nucleus interactions. Generalized parton distributio
External link:
http://arxiv.org/abs/2405.15386
Published in:
Proceedings of the 24th International Society for Music Information Retrieval Conference, 174-181. Milan, Italy, November 5-9, 2023
Nowadays, humans are constantly exposed to music, whether through voluntary streaming services or incidental encounters during commercial breaks. Despite the abundance of music, certain pieces remain more memorable and often gain greater popularity.
External link:
http://arxiv.org/abs/2405.12847
Injecting Salesperson's Dialogue Strategies in Large Language Models with Chain-of-Thought Reasoning
Author:
Chang, Wen-Yu, Chen, Yun-Nung
Recent research in dialogue systems and corpora has focused on two main categories: task-oriented (TOD) and open-domain (chit-chat) dialogues. TOD systems help users accomplish specific tasks, while open-domain systems aim to create engaging conversa
External link:
http://arxiv.org/abs/2404.18564
Author:
Yu, Xiaotong, Chen, Chang-Wen
Efficient visual perception using mobile systems is crucial, particularly in unknown environments such as search and rescue operations, where swift and comprehensive perception of objects of interest is essential. In such real-world applications, obj
External link:
http://arxiv.org/abs/2404.16507
Author:
Liu, Ye, He, Jixuan, Li, Wanhua, Kim, Junsik, Wei, Donglai, Pfister, Hanspeter, Chen, Chang Wen
Video temporal grounding (VTG) is a fine-grained video understanding problem that aims to ground relevant clips in untrimmed videos given natural language queries. Most existing VTG models are built upon frame-wise final-layer CLIP features, aided by
External link:
http://arxiv.org/abs/2404.00801
Author:
Huang, Binyuan, Wen, Yuqing, Zhao, Yucheng, Hu, Yaosi, Liu, Yingfei, Jia, Fan, Mao, Weixin, Wang, Tiancai, Zhang, Chi, Chen, Chang Wen, Chen, Zhenzhong, Zhang, Xiangyu
Autonomous driving progress relies on large-scale annotated datasets. In this work, we explore the potential of generative models to produce vast quantities of freely-labeled data for autonomous driving applications and present SubjectDrive, the firs
External link:
http://arxiv.org/abs/2403.19438