Showing 1 - 10 of 8,254 for search: '"ZHOU, QIANG"'
Multimodal Sentiment Analysis (MSA) leverages heterogeneous modalities, such as language, vision, and audio, to enhance the understanding of human sentiment. While existing models often focus on extracting shared information across modalities or dire
External link:
http://arxiv.org/abs/2412.12225
The planar Turán number of $H$, denoted by $ex_{\mathcal{P}}(n,H)$, is the maximum number of edges in an $n$-vertex $H$-free planar graph. The planar Turán number of $k(k\geq 3)$ vertex-disjoint union of cycles is the trivial value $3n-6$. Lan, S
External link:
http://arxiv.org/abs/2411.18487
Numerical reasoning is pivotal in various artificial intelligence applications, such as natural language processing and recommender systems, where it involves using entities, relations, and attribute values (e.g., weight, length) to infer new factual
External link:
http://arxiv.org/abs/2411.12950
Existing text-to-video (T2V) models often struggle with generating videos with sufficiently pronounced or complex actions. A key limitation lies in the text prompt's inability to precisely convey intricate motion details. To address this, we propose
External link:
http://arxiv.org/abs/2411.08328
Author:
Zhang, Xueying, Zhang, Bin, Wei, Shihai, Li, Hao, Liao, Jinyu, Zhou, Tao, Deng, Guangwei, Wang, You, Song, Haizhi, You, Lixing, Fan, Boyu, Fan, Yunru, Chen, Feng, Guo, Guangcan, Zhou, Qiang
A light-matter interface is an important building block for long-distance quantum networks. A scalable quantum network with high-rate quantum information processing requires the development of integrated light-matter interfaces with broadband and m
External link:
http://arxiv.org/abs/2410.18516
Author:
Zheng, Mulin, Ale, Shizhuo, Chen, Peiqin, Tu, Jingpu, Zhou, Qiang, Song, Haizhi, Wang, You, Wang, Junfeng, Guo, Guangcan, Deng, Guangwei
The interface with spin defects in hexagonal boron nitride has recently become a promising platform and has shown great potential in a wide range of quantum technologies. Varieties of spin properties of $V_B^-$ defects in hexagonal boron nitride (hBN
External link:
http://arxiv.org/abs/2410.06755
Cross-lingual cross-modal retrieval (CCR) aims to retrieve visually relevant content based on non-English queries, without relying on human-labeled cross-modal data pairs during training. One popular approach involves utilizing machine translation (M
External link:
http://arxiv.org/abs/2409.19961
Author:
Ma, Yiwei, Ji, Jiayi, Ye, Ke, Lin, Weihuang, Wang, Zhibin, Zheng, Yonghan, Zhou, Qiang, Sun, Xiaoshuai, Ji, Rongrong
Significant progress has been made in the field of Instruction-based Image Editing (IIE). However, evaluating these models poses a significant challenge. A crucial requirement in this field is the establishment of a comprehensive evaluation benchmark
External link:
http://arxiv.org/abs/2408.14180
With advancements in data availability and computing resources, Multimodal Large Language Models (MLLMs) have showcased capabilities across various fields. However, the quadratic complexity of the vision encoder in MLLMs constrains the resolution of
External link:
http://arxiv.org/abs/2407.16198