Showing 1 - 10 of 35 for search: '"Xing, Jiazheng"'
Author:
Xing, Jiazheng, Xu, Chao, Qian, Yijie, Liu, Yang, Dai, Guang, Sun, Baigui, Liu, Yong, Wang, Jingdong
Virtual try-on focuses on adjusting the given clothes to fit a specific person seamlessly while avoiding any distortion of the patterns and textures of the garment. However, the clothing identity uncontrollability and training inefficiency of existing…
External link:
http://arxiv.org/abs/2404.00878
Author:
Hou, Xiaojun, Xing, Jiazheng, Qian, Yijie, Guo, Yaowei, Xin, Shuo, Chen, Junhao, Tang, Kai, Wang, Mengmeng, Jiang, Zhengkai, Liu, Liang, Liu, Yong
Multimodal Visual Object Tracking (VOT) has recently gained significant attention due to its robustness. Early research focused on fully fine-tuning RGB-based trackers, which was inefficient and lacked generalized representation due to the scarcity of…
External link:
http://arxiv.org/abs/2403.16002
Author:
Xu, Chao, Liu, Yang, Xing, Jiazheng, Wang, Weida, Sun, Mingze, Dan, Jun, Huang, Tianxin, Li, Siyuan, Cheng, Zhi-Qi, Tai, Ying, Sun, Baigui
In this paper, we abstract the process of people hearing speech, extracting meaningful cues, and creating various dynamically audio-consistent talking faces, termed Listening and Imagining, into the task of high-fidelity diverse talking faces generation…
External link:
http://arxiv.org/abs/2403.01901
Author:
Wang, Mengmeng, Xing, Jiazheng, Jiang, Boyuan, Chen, Jun, Mei, Jianbiao, Zuo, Xingxing, Dai, Guang, Wang, Jingdong, Liu, Yong
Published in:
AAAI2024
Recently, the rise of large-scale vision-language pretrained models like CLIP, coupled with the technology of Parameter-Efficient Fine-Tuning (PEFT), has captured substantial attention in video action recognition. Nevertheless, prevailing approaches…
External link:
http://arxiv.org/abs/2401.11649
Author:
Xing, Jiazheng, Wang, Mengmeng, Ruan, Yudi, Chen, Bofan, Guo, Yaowei, Mu, Boyu, Dai, Guang, Wang, Jingdong, Liu, Yong
Class prototype construction and matching are core aspects of few-shot action recognition. Previous methods mainly focus on designing spatiotemporal relation modeling modules or complex temporal alignment algorithms. Despite the promising results, …
External link:
http://arxiv.org/abs/2308.09346
Applying large-scale pre-trained visual models like CLIP to few-shot action recognition tasks can benefit performance and efficiency. Utilizing the "pre-training, fine-tuning" paradigm makes it possible to avoid training a network from scratch, which…
External link:
http://arxiv.org/abs/2308.01532
Spatial and temporal modeling is one of the most core aspects of few-shot action recognition. Most previous works mainly focus on long-term temporal relation modeling based on high-level spatial representations, without considering the crucial low-level…
External link:
http://arxiv.org/abs/2301.07944
The canonical approach to video action recognition dictates a neural model to do a classic and standard 1-of-N majority vote task. Such models are trained to predict a fixed set of predefined categories, limiting their transferable ability on new datasets with…
External link:
http://arxiv.org/abs/2109.08472
Author:
Tang, Qing, Sensale, Sebastian, Bond, Charles, Xing, Jiazheng, Qiao, Andy, Hugelier, Siewert, Arab, Arian, Arya, Gaurav, Lakadamyali, Melike
Published in:
In Current Biology 4 December 2023 33(23):5169-5184
Published in:
In Procedia Manufacturing 2021 54:269-273