Search results - "Jiang, Xinghua"

Report

HRVDA: High-Resolution Visual Document Assistant

Authors: Liu, Chaohu, Yin, Kun, Cao, Haoyu, Jiang, Xinghua, Li, Xin, Liu, Yinsong, Jiang, Deqiang, Sun, Xing, Xu, Linli

Leveraging vast training data, multimodal large language models (MLLMs) have demonstrated formidable general visual comprehension capabilities and achieved remarkable performance across various tasks. However, their performance in visual document und

External link: http://arxiv.org/abs/2404.06918

Report

Enhancing Visual Document Understanding with Contrastive Learning in Large Visual-Language Models

Authors: Li, Xin, Wu, Yunfei, Jiang, Xinghua, Guo, Zhihao, Gong, Mingming, Cao, Haoyu, Liu, Yinsong, Jiang, Deqiang, Sun, Xing

Recently, the advent of Large Visual-Language Models (LVLMs) has received increasing attention across various domains, particularly in the field of visual document understanding (VDU). Different from conventional vision-language tasks, VDU is specifi

External link: http://arxiv.org/abs/2402.19014

Report

Abyss Aerosols

Authors: Jiang, Xinghua, Rotily, Lucas, Villermaux, Emmanuel, Wang, Xiaofei

Bubble bursting on water surfaces is believed to be a main mechanism to produce submicron drops, including sea spray aerosols, which play a critical role in forming cloud and transferring various biological and chemical substances from water to the a

External link: http://arxiv.org/abs/2310.16551

Report

AudioFormer: Audio Transformer learns audio feature representations from discrete acoustic codes

Authors: Li, Zhaohui, Wang, Haitao, Jiang, Xinghua

We propose a method named AudioFormer, which learns audio feature representations through the acquisition of discrete acoustic codes and subsequently fine-tunes them for audio classification tasks. Initially, we introduce a novel perspective by conside

External link: http://arxiv.org/abs/2308.07221

Academic article

Collaborative security assessment of cloud-edge-device distributed systems based on order parameters

Authors: Qigang FAN, Zhongyuan JIANG, Xinghua LI, Jianfeng MA

Published in: 网络与信息安全学报, Vol 10, Iss 3, Pp 38-51 (2024)

Distributed computing systems based on cloud-edge-device have been successfully serving thousands of applications and have become mainstream, characterized by a wide audience, high user experience requirements, and high security expectations. However

External link: https://doaj.org/article/769dda6cf8ce4a3282921393c177dcbc

Report

OS-MSL: One Stage Multimodal Sequential Link Framework for Scene Segmentation and Classification

Authors: Liu, Ye, Qiao, Lingfeng, Yin, Di, Jiang, Zhuoxuan, Jiang, Xinghua, Jiang, Deqiang, Ren, Bo

Scene segmentation and classification (SSC) serve as a critical step towards the field of video structuring analysis. Intuitively, jointly learning of these two tasks can promote each other by sharing common information. However, scene segmentation c

External link: http://arxiv.org/abs/2207.01241

Report

The Devil is in the Frequency: Geminated Gestalt Autoencoder for Self-Supervised Visual Pre-Training

Authors: Liu, Hao, Jiang, Xinghua, Li, Xin, Guo, Antai, Jiang, Deqiang, Ren, Bo

The self-supervised Masked Image Modeling (MIM) schema, following "mask-and-reconstruct" pipeline of recovering contents from masked image, has recently captured the increasing interest in the multimedia community, owing to the excellent ability of l

External link: http://arxiv.org/abs/2204.08227

Academic article

Ultrasonic humidifier aerosols: Observed high heavy metal enrichment and a new emission control method

Authors: Zhang, Tao, Lu, Xiaohui, Zhang, Ruoyu, Jiang, Xinghua, Yang, Shanye, Ma, Xiewen, Gao, Qianqian, Wang, Xiaofei

Published in: Journal of Environmental Sciences, February 2025, 148:298-305

Report

NomMer: Nominate Synergistic Context in Vision Transformer for Visual Recognition

Authors: Liu, Hao, Jiang, Xinghua, Li, Xin, Bao, Zhimin, Jiang, Deqiang, Ren, Bo

Recently, Vision Transformers (ViT), with the self-attention (SA) as the de facto ingredients, have demonstrated great potential in the computer vision community. For the sake of trade-off between efficiency and performance, a group of works merely p

External link: http://arxiv.org/abs/2111.12994
