Showing 1 - 10 of 16,141 for search: '"Shuwei An"'
Transformer-based methods have recently achieved significant success in 3D human pose estimation, owing to their strong ability to model long-range dependencies. However, relying solely on the global attention mechanism is insufficient for capturing
External link:
http://arxiv.org/abs/2412.19676
Visual Text-to-Speech (VTTS) aims to take the environmental image as the prompt to synthesize the reverberant speech for the spoken content. The challenge of this task lies in understanding the spatial environment from the image. Many attempts have b
External link:
http://arxiv.org/abs/2412.11409
Author:
Shi, Shuwei, Gong, Biao, Chen, Xi, Zheng, Dandan, Tan, Shuai, Yang, Zizheng, Li, Yuyuan, He, Jingwen, Zheng, Kecheng, Chen, Jingdong, Yang, Ming, Zheng, Yinqiang
The image-to-video (I2V) generation is conditioned on the static image, which has been enhanced recently by the motion intensity as an additional control signal. These motion-aware models are appealing to generate diverse motion patterns, yet there l
External link:
http://arxiv.org/abs/2412.05848
Author:
Tan, Shuai, Gong, Biao, Feng, Yutong, Zheng, Kecheng, Zheng, Dandan, Shi, Shuwei, Shen, Yujun, Chen, Jingdong, Yang, Ming
Text serves as the key control signal in video generation due to its narrative nature. To render text descriptions into video clips, current video diffusion models borrow features from text encoders yet struggle with limited text comprehension. The r
External link:
http://arxiv.org/abs/2412.03085
Author:
Xing, Shuwei, Mirzaei, Mateen, Xia, Wenyao, Ahmed-Fazal, Inaara, Pardasani, Utsav, Jarayathne, Uditha, Illsley, Scott, Leite, Leandro Cardarelli, Fenster, Aaron, Peters, Terry M., Chen, Elvis C. S.
The 2D projective nature of X-ray radiography presents significant limitations in fluoroscopy-guided interventions, particularly the loss of depth perception and prolonged radiation exposure. Integrating magnetic trackers into these workflows is prom
External link:
http://arxiv.org/abs/2411.07495
Author:
He, Shuwei, Liu, Rui
Visual Text-to-Speech (VTTS) aims to take the environmental image as the prompt to synthesize reverberant speech for the spoken content. Previous works focus on the RGB modality for global environmental modeling, overlooking the potential of multi-so
External link:
http://arxiv.org/abs/2410.14101
Author:
Xing, Shuwei, Cool, Derek W., Tessier, David, Chen, Elvis C. S., Peters, Terry M., Fenster, Aaron
Liver tumor ablation procedures require accurate placement of the needle applicator at the tumor centroid. The lower-cost and real-time nature of ultrasound (US) has advantages over computed tomography (CT) for applicator guidance, however, in some p
External link:
http://arxiv.org/abs/2410.02579
Author:
Wang, Yulin, Xiong, Honglin, Sun, Kaicong, Bai, Shuwei, Dai, Ling, Ding, Zhongxiang, Liu, Jiameng, Wang, Qian, Liu, Qian, Shen, Dinggang
Multimodal brain magnetic resonance (MR) imaging is indispensable in neuroscience and neurology. However, due to the accessibility of MRI scanners and their lengthy acquisition time, multimodal MR images are not commonly available. Current MR image s
External link:
http://arxiv.org/abs/2409.16818
Machine learning has been used to identify phase transitions in a variety of physical systems. However, there is still a lack of relevant research on non-Bloch energy braiding in non-Hermitian systems. In this work, we study non-Bloch energy braiding
External link:
http://arxiv.org/abs/2408.01141
Author:
Li, Zhongfu, Ma, Shaojie, Li, Shuwei, You, Oubo, Liu, Yachao, Yang, Qingdong, Xiang, Yuanjiang, Zhou, Peiheng, Zhang, Shuang
Topological singularities, such as Weyl points and Dirac points, can give rise to unidirectional propagation channels known as chiral zero modes (CZMs) when subject to a magnetic field. These CZMs are responsible for intriguing phenomena like the chi
External link:
http://arxiv.org/abs/2407.03390