Search results

Report

Towards Robust Extractive Question Answering Models: Rethinking the Training Methodology

This paper proposes a novel training method to improve the robustness of Extractive Question Answering (EQA) models. Previous research has shown that existing models, when trained on EQA datasets that include unanswerable questions, demonstrate a sig

External link: http://arxiv.org/abs/2409.19766

Report

Diffusion Models For Multi-Modal Generative Modeling

Authors: Chen, Changyou; Ding, Han; Sisman, Bunyamin; Xu, Yi; Xie, Ouye; Yao, Benjamin Z.; Tran, Son Dinh; Zeng, Belinda

Diffusion-based generative modeling has been achieving state-of-the-art results on various generation tasks. Most diffusion models, however, are limited to a single-generation modeling. Can we generalize diffusion models with the ability of multi-mod

External link: http://arxiv.org/abs/2407.17571

Report

X-Former: Unifying Contrastive and Reconstruction Learning for MLLMs

Authors: Swetha, Sirnam; Yang, Jinyu; Neiman, Tal; Rizve, Mamshad Nayeem; Tran, Son; Yao, Benjamin; Chilimbi, Trishul; Shah, Mubarak

Recent advancements in Multimodal Large Language Models (MLLMs) have revolutionized the field of vision-language understanding by integrating visual perception capabilities into Large Language Models (LLMs). The prevailing trend in this field involve

External link: http://arxiv.org/abs/2407.13851

Report

Open Vocabulary Multi-Label Video Classification

Authors: Gupta, Rohit; Rizve, Mamshad Nayeem; Unnikrishnan, Jayakrishnan; Tawari, Ashish; Tran, Son; Shah, Mubarak; Yao, Benjamin; Chilimbi, Trishul

Pre-trained vision-language models (VLMs) have enabled significant progress in open vocabulary computer vision tasks such as image classification, object detection and image segmentation. Some recent works have focused on extending VLMs to open vocab

External link: http://arxiv.org/abs/2407.09073

Report

VLUE: A New Benchmark and Multi-task Knowledge Transfer Learning for Vietnamese Natural Language Understanding

Authors: Do, Phong Nguyen-Thuan; Tran, Son Quoc; Hoang, Phu Gia; Van Nguyen, Kiet; Nguyen, Ngan Luu-Thuy

The success of Natural Language Understanding (NLU) benchmarks in various languages, such as GLUE for English, CLUE for Chinese, KLUE for Korean, and IndoNLU for Indonesian, has facilitated the evaluation of new NLU models across a wide range of task

External link: http://arxiv.org/abs/2403.15882

Report

VidLA: Video-Language Alignment at Scale

Authors: Rizve, Mamshad Nayeem; Fei, Fan; Unnikrishnan, Jayakrishnan; Tran, Son; Yao, Benjamin Z.; Zeng, Belinda; Shah, Mubarak; Chilimbi, Trishul

In this paper, we propose VidLA, an approach for video-language alignment at scale. There are two major limitations of previous video-language alignment approaches. First, they do not capture both short-range and long-range temporal dependencies and

External link: http://arxiv.org/abs/2403.14870

Report

Deep Learning for Plant Identification and Disease Classification from Leaf Images: Multi-prediction Approaches

Authors: Yao, Jianping; Tran, Son N.; Garg, Saurabh; Sawyer, Samantha

Deep learning plays an important role in modern agriculture, especially in plant pathology using leaf images where convolutional neural networks (CNN) are attracting a lot of attention. While numerous reviews have explored the applications of deep le

External link: http://arxiv.org/abs/2310.16273

Report

Machine Learning for Leaf Disease Classification: Data, Techniques and Applications

Authors: Yao, Jianping; Tran, Son N.; Sawyer, Samantha; Garg, Saurabh

Published in: Artificial Intelligence Review 2023

The growing demand for sustainable development brings a series of information technologies to help agriculture production. Especially, the emergence of machine learning applications, a branch of artificial intelligence, has shown multiple breakthroug

External link: http://arxiv.org/abs/2310.12509

Report

AGent: A Novel Pipeline for Automatically Creating Unanswerable Questions

Authors: Tran, Son Quoc; Do, Gia-Huy; Do, Phong Nguyen-Thuan; Kretchmar, Matt; Du, Xinya

The development of large high-quality datasets and high-performing models have led to significant advancements in the domain of Extractive Question Answering (EQA). This progress has sparked considerable interest in exploring unanswerable questions w

External link: http://arxiv.org/abs/2309.05103

Report

UnsMOT: Unified Framework for Unsupervised Multi-Object Tracking with Geometric Topology Guidance

Authors: Tran, Son; Tran, Cong; Tran, Anh; Pham, Cuong

Object detection has long been a topic of high interest in computer vision literature. Motivated by the fact that annotating data for the multi-object tracking (MOT) problem is immensely expensive, recent studies have turned their attention to the un

External link: http://arxiv.org/abs/2309.01078
