Showing 1 - 10 of 1,333 for search: '"Guo, Xun"'
Current techniques for detecting AI-generated text are largely confined to manual feature crafting and supervised binary classification paradigms. These methodologies typically lead to performance bottlenecks and unsatisfactory generalizability. Cons
External link:
http://arxiv.org/abs/2410.20964
Authors:
Li, Zongyi, Hu, Shujie, Liu, Shujie, Zhou, Long, Choi, Jeongsoo, Meng, Lingwei, Guo, Xun, Li, Jinyu, Ling, Hefei, Wei, Furu
Text-to-video models have recently undergone rapid and substantial advancements. Nevertheless, due to limitations in data and computational resources, achieving efficient generation of long videos with rich motion dynamics remains a significant chall
External link:
http://arxiv.org/abs/2410.20502
Authors:
Guo, Xun, Zheng, Mingwu, Hou, Liang, Gao, Yuan, Deng, Yufan, Wan, Pengfei, Zhang, Di, Liu, Yufan, Hu, Weiming, Zha, Zhengjun, Huang, Haibin, Ma, Chongyang
Text-guided image-to-video (I2V) generation aims to generate a coherent video that preserves the identity of the input image and semantically aligns with the input prompt. Existing methods typically augment pretrained text-to-video (T2V) models by ei
External link:
http://arxiv.org/abs/2312.16693
Authors:
Wang, Yizhou, Wu, Yixuan, Tang, Shixiang, He, Weizhen, Guo, Xun, Zhu, Feng, Bai, Lei, Zhao, Rui, Wu, Jian, He, Tong, Ouyang, Wanli
Human-centric perception tasks, e.g., pedestrian detection, skeleton-based action recognition, and pose estimation, have wide industrial applications, such as metaverse and sports analysis. There is a recent surge to develop human-centric foundation
External link:
http://arxiv.org/abs/2312.01697
Diffusion-based methods can generate realistic images and videos, but they struggle to edit existing objects in a video while preserving their appearance over time. This prevents diffusion models from being applied to natural video editing in practic
External link:
http://arxiv.org/abs/2308.09592
Authors:
Song, Enxin, Chai, Wenhao, Wang, Guanhong, Zhang, Yucheng, Zhou, Haoyang, Wu, Feiyang, Chi, Haozhe, Guo, Xun, Ye, Tian, Zhang, Yanting, Lu, Yan, Hwang, Jenq-Neng, Wang, Gaoang
Recently, integrating video foundation models and large language models to build a video understanding system can overcome the limitations of specific pre-defined vision tasks. Yet, existing systems can only handle videos with very few frames. For lo
External link:
http://arxiv.org/abs/2307.16449
Temporal modeling is crucial for various video learning tasks. Most recent approaches employ either factorized (2D+1D) or joint (3D) spatial-temporal operations to extract temporal contexts from the input frames. While the former is more efficient in
External link:
http://arxiv.org/abs/2210.00132
Published in:
ACS Omega, Vol 9, Iss 14, Pp 16648-16655 (2024)
External link:
https://doaj.org/article/70eabc6857c54bc1ab33a7fd01bf3018
Published in:
Biofilm, Vol 7, Iss , Pp 100175- (2024)
Staphylococcus aureus can readily form biofilm which enhances the drug-resistance, resulting in life-threatening infections involving different organs. Biofilm formation occurs due to a series of developmental events including bacterial adhesion, agg
External link:
https://doaj.org/article/e23aa6ade62946b89befde6872d89f76
One-shot object detection aims at detecting novel objects according to merely one given instance. With extreme data scarcity, current approaches explore various feature fusions to obtain directly transferable meta-knowledge. Yet, their performances a
External link:
http://arxiv.org/abs/2203.09093