Search results - "Jiang, Boyuan"

Report

Oracle Bone Inscriptions Multi-modal Dataset

Oracle bone inscriptions (OBI) is the earliest developed writing system in China, bearing invaluable written exemplifications of early Shang history and paleography. However, the task of deciphering OBI, in the current climate of the scholarship, can

External link: http://arxiv.org/abs/2407.03900

Report

NoiseBoost: Alleviating Hallucination with Noise Perturbation for Multimodal Large Language Models

Authors: Wu, Kai, Jiang, Boyuan, Jiang, Zhengkai, He, Qingdong, Luo, Donghao, Wang, Shengzhi, Liu, Qingwen, Wang, Chengjie

Multimodal large language models (MLLMs) contribute a powerful mechanism to understanding visual information building on large language models. However, MLLMs are notorious for suffering from hallucinations, especially when generating lengthy, detail

External link: http://arxiv.org/abs/2405.20081

Report

M2-CLIP: A Multimodal, Multi-task Adapting Framework for Video Action Recognition

Authors: Wang, Mengmeng, Xing, Jiazheng, Jiang, Boyuan, Chen, Jun, Mei, Jianbiao, Zuo, Xingxing, Dai, Guang, Wang, Jingdong, Liu, Yong

Published in: AAAI2024

Recently, the rise of large-scale vision-language pretrained models like CLIP, coupled with the technology of Parameter-Efficient FineTuning (PEFT), has captured substantial attention in video action recognition. Nevertheless, prevailing approaches

External link: http://arxiv.org/abs/2401.11649

Report

PortraitBooth: A Versatile Portrait Model for Fast Identity-preserved Personalization

Authors: Peng, Xu, Zhu, Junwei, Jiang, Boyuan, Tai, Ying, Luo, Donghao, Zhang, Jiangning, Lin, Wei, Jin, Taisong, Wang, Chengjie, Ji, Rongrong

Recent advancements in personalized image generation using diffusion models have been noteworthy. However, existing methods suffer from inefficiencies due to the requirement for subject-specific fine-tuning. This computationally intensive process hin

External link: http://arxiv.org/abs/2312.06354

Report

Probabilistic Triangulation for Uncalibrated Multi-View 3D Human Pose Estimation

Authors: Jiang, Boyuan, Hu, Lei, Xia, Shihong

3D human pose estimation has been a long-standing challenge in computer vision and graphics, where multi-view methods have significantly progressed but are limited by the tedious calibration processes. Existing multi-view methods are restricted to fi

External link: http://arxiv.org/abs/2309.04756

Report

Dynamic Frame Interpolation in Wavelet Domain

Authors: Kong, Lingtong, Jiang, Boyuan, Luo, Donghao, Chu, Wenqing, Tai, Ying, Wang, Chengjie, Yang, Jie

Video frame interpolation is an important low-level vision task, which can increase frame rate for more fluent visual experience. Existing methods have achieved great success by employing advanced motion models and synthesis networks. However, the sp

External link: http://arxiv.org/abs/2309.03508

Report

Pose-aware Attention Network for Flexible Motion Retargeting by Body Part

Authors: Hu, Lei, Zhang, Zihao, Zhong, Chongyang, Jiang, Boyuan, Xia, Shihong

Motion retargeting is a fundamental problem in computer graphics and computer vision. Existing approaches usually have many strict requirements, such as the source-target skeletons needing to have the same number of joints or share the same topology.

External link: http://arxiv.org/abs/2306.08006

Report

IFRNet: Intermediate Feature Refine Network for Efficient Frame Interpolation

Authors: Kong, Lingtong, Jiang, Boyuan, Luo, Donghao, Chu, Wenqing, Huang, Xiaoming, Tai, Ying, Wang, Chengjie, Yang, Jie

Prevailing video frame interpolation algorithms, that generate the intermediate frames from consecutive inputs, typically rely on complex model architectures with heavy parameters or large delay, hindering them from diverse real-time applications. In

External link: http://arxiv.org/abs/2205.14620

Report

Learning Comprehensive Motion Representation for Action Recognition

Authors: Wu, Mingyu, Jiang, Boyuan, Luo, Donghao, Yan, Junchi, Wang, Yabiao, Tai, Ying, Wang, Chengjie, Li, Jilin, Huang, Feiyue, Yang, Xiaokang

For action recognition learning, 2D CNN-based methods are efficient but may yield redundant features due to applying the same 2D convolution kernel to each frame. Recent efforts attempt to capture motion information by establishing inter-frame connec

External link: http://arxiv.org/abs/2103.12278

Report

Multi-Level Adaptive Region of Interest and Graph Learning for Facial Action Unit Recognition

Authors: Yan, Jingwei, Jiang, Boyuan, Wang, Jingjing, Li, Qiang, Wang, Chunmao, Pu, Shiliang

In facial action unit (AU) recognition tasks, regional feature learning and AU relation modeling are two effective aspects which are worth exploring. However, the limited representation capacity of regional features makes it difficult for relation mo

External link: http://arxiv.org/abs/2102.12154
