Showing 1 - 10 of 5,553 for search: '"Zhang, YuXin"'
Despite the efficiency of prompt learning in transferring vision-language models (VLMs) to downstream tasks, existing methods mainly learn the prompts in a coarse-grained manner where the learned prompt vectors are shared across all categories. Conse
External link:
http://arxiv.org/abs/2412.08176
Video-to-music generation presents significant potential in video production, requiring the generated music to be both semantically and rhythmically aligned with the video. Achieving this alignment demands advanced music generation capabilities, soph
External link:
http://arxiv.org/abs/2412.06296
Author:
Xu, Yu, Tang, Fan, Cao, Juan, Zhang, Yuxin, Kong, Xiaoyu, Li, Jintao, Deussen, Oliver, Lee, Tong-Yee
Diffusion Transformers (DiTs) have exhibited robust capabilities in image generation tasks. However, accurate text-guided image editing for multimodal DiTs (MM-DiTs) still poses a significant challenge. Unlike UNet-based structures that could utilize
External link:
http://arxiv.org/abs/2411.15034
Author:
Zhang, Yuxin, Zheng, Dandan, Gong, Biao, Chen, Jingdong, Yang, Ming, Dong, Weiming, Xu, Changsheng
Lighting plays a pivotal role in ensuring the naturalness of video generation, significantly influencing the aesthetic quality of the generated content. However, due to the deep coupling between lighting and the temporal features of videos, it remain
External link:
http://arxiv.org/abs/2410.22979
Object detection algorithms are pivotal components of unmanned aerial vehicle (UAV) imaging systems, extensively employed in complex fields. However, images captured by high-mobility UAVs often suffer from motion blur cases, which significantly imped
External link:
http://arxiv.org/abs/2410.17822
Author:
Zhang, Yuxin, Lin, Zheng, Chen, Zhe, Fang, Zihan, Zhu, Wenjun, Chen, Xianhao, Zhao, Jin, Gao, Yue
Traditional federated learning (FL) frameworks rely heavily on terrestrial networks, where coverage limitations and increasing bandwidth congestion significantly hinder model convergence. Fortunately, the advancement of low-Earth orbit (LEO) satellit
External link:
http://arxiv.org/abs/2409.13503
Ultrasound imaging, despite its widespread use in medicine, often suffers from various sources of noise and artifacts that impact the signal-to-noise ratio and overall image quality. Enhancing ultrasound images requires a delicate balance between con
External link:
http://arxiv.org/abs/2409.11380
Ultrafast Plane-Wave (PW) imaging often produces artifacts and shadows that vary with insonification angles. We propose a novel approach using Implicit Neural Representations (INRs) to compactly encode multi-planar sequences while preserving crucial
External link:
http://arxiv.org/abs/2409.11370
Author:
Tan, Ashton Yu Xuan, Yang, Yingkai, Zhang, Xiaofei, Li, Bowen, Gao, Xiaorong, Zheng, Sifa, Wang, Jianqiang, Gu, Xinyu, Li, Jun, Zhao, Yang, Zhang, Yuxin, Stathaki, Tania
Enhancing the safety of autonomous vehicles is crucial, especially given recent accidents involving automated systems. As passengers in these vehicles, humans' sensory perception and decision-making can be integrated with autonomous systems to improv
External link:
http://arxiv.org/abs/2408.16315
Author:
Liu, Xinyu, Shen, Shuyu, Li, Boyan, Ma, Peixian, Jiang, Runzhi, Zhang, Yuxin, Fan, Ju, Li, Guoliang, Tang, Nan, Luo, Yuyu
Translating users' natural language queries (NL) into SQL queries (i.e., NL2SQL, a.k.a., Text-to-SQL) can significantly reduce barriers to accessing relational databases and support various commercial applications. The performance of NL2SQL has been
External link:
http://arxiv.org/abs/2408.05109