Showing 1 - 10 of 968 for search: '"Wang, Guoxin"'
Generating images with accurately represented text, especially in non-Latin languages, poses a significant challenge for diffusion models. Existing approaches, such as the integration of hint condition diagrams via auxiliary networks (e.g., ControlNe
External link:
http://arxiv.org/abs/2409.17524
In audio-driven video generation, creating Mandarin videos presents significant challenges. Collecting comprehensive Mandarin datasets is difficult, and the complex lip movements in Mandarin further complicate model training compared to English. In t
External link:
http://arxiv.org/abs/2409.13268
Video-based physiology, exemplified by remote photoplethysmography (rPPG), extracts physiological signals such as pulse and respiration by analyzing subtle changes in video recordings. This non-contact, real-time monitoring method holds great potenti
External link:
http://arxiv.org/abs/2409.09366
Wearable Internet of Things (IoT) devices are gaining ground for continuous physiological data acquisition and health monitoring. These physiological signals can be used for security applications to achieve continuous authentication and user convenie
External link:
http://arxiv.org/abs/2409.05627
Published in:
Information Processing & Management, Volume 61, Issue 4, July 2024, 103716
In this paper, we propose a novel visual Semantic-Spatial Self-Highlighting Network (termed 3SHNet) for high-precision, high-efficiency and high-generalization image-sentence retrieval. 3SHNet highlights the salient identification of prominent object
External link:
http://arxiv.org/abs/2404.17273
Author:
Xiong, Yizhe, Chen, Hui, Hao, Tianxiang, Lin, Zijia, Han, Jungong, Zhang, Yuesong, Wang, Guoxin, Bao, Yongjun, Ding, Guiguang
Recently, the scale of transformers has grown rapidly, which introduces considerable challenges in terms of training overhead and inference efficiency in the scope of task adaptation. Existing works, namely Parameter-Efficient Fine-Tuning (PEFT) and
External link:
http://arxiv.org/abs/2403.09192
Published in:
Journal of Intelligent Manufacturing and Special Equipment, 2024, Vol. 5, Issue 1, pp. 55-69.
External link:
http://www.emeraldinsight.com/doi/10.1108/JIMSE-02-2024-0004
Unsupervised learning methods have become increasingly important in deep learning due to their demonstrated large utilization of datasets and higher accuracy in computer vision and natural language processing tasks. There is a growing trend to extend
External link:
http://arxiv.org/abs/2310.11153
Author:
Wang, Guoxin, Cao, Xuyang, An, Shan, Fan, Fengmei, Zhang, Chao, Wang, Jinsong, Yu, Feng, Wang, Zhiren
Deep learning approaches, together with neuroimaging techniques, play an important role in psychiatric disorders classification. Previous studies on psychiatric disorders diagnosis mainly focus on using functional connectivity matrices of resting-sta
External link:
http://arxiv.org/abs/2310.02690
Author:
Lv, Tengchao, Huang, Yupan, Chen, Jingye, Zhao, Yuzhong, Jia, Yilin, Cui, Lei, Ma, Shuming, Chang, Yaoyao, Huang, Shaohan, Wang, Wenhui, Dong, Li, Luo, Weiyao, Wu, Shaoxiang, Wang, Guoxin, Zhang, Cha, Wei, Furu
The automatic reading of text-intensive images represents a significant advancement toward achieving Artificial General Intelligence (AGI). In this paper we present KOSMOS-2.5, a multimodal literate model for machine reading of text-intensive images.
External link:
http://arxiv.org/abs/2309.11419