Search results

Report

Scene-Text Grounding for Text-Based Video Question Answering

Authors: Zhou, Sheng; Xiao, Junbin; Yang, Xun; Song, Peipei; Guo, Dan; Yao, Angela; Wang, Meng; Chua, Tat-Seng

Existing efforts in text-based video question answering (TextVideoQA) are criticized for their opaque decision-making and heavy reliance on scene-text recognition. In this paper, we propose to study Grounded TextVideoQA by forcing models to answer que

External link: http://arxiv.org/abs/2409.14319

Report

Dual-stream Feature Augmentation for Domain Generalization

Authors: Wang, Shanshan; ALuSi; Yang, Xun; Xu, Ke; Tan, Huibin; Zhang, Xingyi

Domain generalization (DG) task aims to learn a robust model from source domains that could handle the out-of-distribution (OOD) issue. In order to improve the generalization ability of the model in unseen domains, increasing the diversity of trainin

External link: http://arxiv.org/abs/2409.04699

Report

GRPose: Learning Graph Relations for Human Image Generation with Pose Priors

Authors: Yin, Xiangchen; Di, Donglin; Fan, Lei; Li, Hao; Wei, Chen; Gou, Xiaofei; Song, Yang; Sun, Xiao; Yang, Xun

Recent methods using diffusion models have made significant progress in human image generation with various additional controls such as pose priors. However, existing approaches still struggle to generate high-quality images with consistent pose alig

External link: http://arxiv.org/abs/2408.16540

Report

Vulnerabilities in AI-generated Image Detection: The Challenge of Adversarial Attacks

Authors: Diao, Yunfeng; Zhai, Naixin; Miao, Changtao; Yang, Xun; Wang, Meng

Recent advancements in image synthesis, particularly with the advent of GAN and Diffusion models, have amplified public concerns regarding the dissemination of disinformation. To address such concerns, numerous AI-generated Image (AIGI) Detectors hav

External link: http://arxiv.org/abs/2407.20836

Report

Advancing Prompt Learning through an External Layer

Authors: Cui, Fangming; Yang, Xun; Wu, Chao; Xiao, Liang; Tian, Xinmei

Prompt learning represents a promising method for adapting pre-trained vision-language models (VLMs) to various downstream tasks by learning a set of text embeddings. One challenge inherent to these methods is the poor generalization performance due

External link: http://arxiv.org/abs/2407.19674

Report

Towards Scale-Aware Full Surround Monodepth with Transformers

Authors: Yang, Yuchen; Wang, Xinyi; Li, Dong; Tian, Lu; Sirasao, Ashish; Yang, Xun

Full surround monodepth (FSM) methods can learn from multiple camera views simultaneously in a self-supervised manner to predict the scale-aware depth, which is more practical for real-world applications in contrast to scale-ambiguous depth from a st

External link: http://arxiv.org/abs/2407.10406

Report

Boosting Adversarial Transferability for Skeleton-based Action Recognition via Exploring the Model Posterior Space

Authors: Diao, Yunfeng; Wu, Baiqi; Zhang, Ruixuan; Yang, Xun; Wang, Meng; Wang, He

Skeletal motion plays a pivotal role in human activity recognition (HAR). Recently, attack methods have been proposed to identify the universal vulnerability of skeleton-based HAR (S-HAR). However, the research of adversarial transferability on S-HAR

External link: http://arxiv.org/abs/2407.08572

Report

TrAME: Trajectory-Anchored Multi-View Editing for Text-Guided 3D Gaussian Splatting Manipulation

Authors: Luo, Chaofan; Di, Donglin; Yang, Xun; Ma, Yongjia; Xue, Zhou; Wei, Chen; Liu, Yebin

Despite significant strides in the field of 3D scene editing, current methods encounter substantial challenges, particularly in preserving 3D consistency in the multi-view editing process. To tackle this challenge, we propose a progressive 3D editing stra

External link: http://arxiv.org/abs/2407.02034

Report

Gradually Vanishing Gap in Prototypical Network for Unsupervised Domain Adaptation

Authors: Wang, Shanshan; Zhou, Hao; Yang, Xun; He, Zhenwei; Wang, Mengzhu; Zhang, Xingyi; Wang, Meng

Unsupervised domain adaptation (UDA) is a critical problem for transfer learning, which aims to transfer the semantic information from labeled source domain to unlabeled target domain. Recent advancements in UDA models have demonstrated significant g

External link: http://arxiv.org/abs/2405.17774

Report

Dual-State Personalized Knowledge Tracing with Emotional Incorporation

Authors: Wang, Shanshan; Yuan, Fangzheng; Wang, Keyang; Yang, Xun; Zhang, Xingyi; Wang, Meng

Knowledge tracing has been widely used in online learning systems to guide the students' future learning. However, most existing KT models primarily focus on extracting abundant information from the question sets and explore the relationships between

External link: http://arxiv.org/abs/2405.16799
