Search results

Report

TextSquare: Scaling up Text-Centric Visual Instruction Tuning

Authors: Tang, Jingqun; Lin, Chunhui; Zhao, Zhen; Wei, Shu; Wu, Binghong; Liu, Qi; Feng, Hao; Li, Yang; Wang, Siqi; Liao, Lei; Shi, Wei; Liu, Yuliang; Liu, Hao; Xie, Yuan; Bai, Xiang; Huang, Can

Text-centric visual question answering (VQA) has made great strides with the development of Multimodal Large Language Models (MLLMs), yet open-source models still fall short of leading models like GPT4V and Gemini, partly due to a lack of extensive,

External link: http://arxiv.org/abs/2404.12803

Report

Asphalt Concrete Characterization Using Digital Image Correlation: A Systematic Review of Best Practices, Applications, and Future Vision

Authors: Wang, Siqi; Zhu, Zehui; Ma, Tao; Fan, Jianwei

Digital Image Correlation (DIC) is an optical technique that measures displacement and strain by tracking pattern movement in a sequence of captured images during testing. DIC has gained recognition in asphalt pavement engineering since the early 200

External link: http://arxiv.org/abs/2402.17074

Report

A Unified Framework for Connecting Noise Modeling to Boost Noise Detection

Authors: Wang, Siqi; Pham, Chau; Plummer, Bryan A.

Noisy labels can impair model performance, making the study of learning with noisy labels an important topic. Two conventional approaches are noise modeling and noise detection. However, these two methods are typically studied independently, and ther

External link: http://arxiv.org/abs/2312.00827

Report

CHAMMI: A benchmark for channel-adaptive models in microscopy imaging

Authors: Chen, Zitong; Pham, Chau; Wang, Siqi; Doron, Michael; Moshkov, Nikita; Plummer, Bryan A.; Caicedo, Juan C.

Most neural networks assume that input images have a fixed number of channels (three for RGB images). However, there are many settings where the number of channels may vary, such as microscopy images where the number of channels changes depending on

External link: http://arxiv.org/abs/2310.19224

Report

Human-M3: A Multi-view Multi-modal Dataset for 3D Human Pose Estimation in Outdoor Scenes

Authors: Fan, Bohao; Wang, Siqi; Guo, Wenxuan; Zheng, Wenzhao; Feng, Jianjiang; Zhou, Jie

3D human pose estimation in outdoor environments has garnered increasing attention recently. However, prevalent 3D human pose datasets pertaining to outdoor scenes lack diversity, as they predominantly utilize only one type of modality (RGB image or

External link: http://arxiv.org/abs/2308.00628

Report

LNL+K: Learning with Noisy Labels and Noise Source Distribution Knowledge

Authors: Wang, Siqi; Plummer, Bryan A.

Learning with noisy labels (LNL) is challenging as the model tends to memorize noisy labels, which can lead to overfitting. Many LNL methods detect clean samples by maximizing the similarity between samples in each category, which does not make any a

External link: http://arxiv.org/abs/2306.11911

Report

USD: Unknown Sensitive Detector Empowered by Decoupled Objectness and Segment Anything Model

Authors: He, Yulin; Chen, Wei; Tan, Yusong; Wang, Siqi

Open World Object Detection (OWOD) is a novel and challenging computer vision task that enables object detection with the ability to detect unknown objects. Existing methods typically estimate the object likelihood with an additional objectness branc

External link: http://arxiv.org/abs/2306.02275

Report

An Effective Transformer-based Solution for RSNA Intracranial Hemorrhage Detection Competition

Authors: Shang, Fangxin; Wang, Siqi; Wang, Xiaorong; Yang, Yehui

We present an effective method for Intracranial Hemorrhage Detection (IHD) that exceeds the performance of the winning solution in the RSNA-IHD competition (2019). Meanwhile, our model uses only a quarter of the parameters and ten percent of the FLOPs compared to the w

External link: http://arxiv.org/abs/2205.07556

Report

Video Abnormal Event Detection by Learning to Complete Visual Cloze Tests

Authors: Wang, Siqi; Yu, Guang; Cai, Zhiping; Liu, Xinwang; Zhu, En; Yin, Jianping

Although deep neural networks (DNNs) enable great progress in video abnormal event detection (VAD), existing solutions typically suffer from two issues: (1) The localization of video events cannot be both precise and comprehensive. (2) The semantics

External link: http://arxiv.org/abs/2108.02356

Report

Deep Anomaly Discovery From Unlabeled Videos via Normality Advantage and Self-Paced Refinement

Authors: Yu, Guang; Wang, Siqi; Cai, Zhiping; Liu, Xinwang; Xu, Chuanfu; Wu, Chengkun

While classic video anomaly detection (VAD) requires labeled normal videos for training, emerging unsupervised VAD (UVAD) aims to discover anomalies directly from fully unlabeled videos. However, existing UVAD methods still rely on shallow models to

External link: http://arxiv.org/abs/2108.01975
