Search results - "Madasu, Avinash"

Report

Is Your Paper Being Reviewed by an LLM? Investigating AI Text Detectability in Peer Review

Authors: Yu, Sungduk; Luo, Man; Madasu, Avinash; Lal, Vasudev; Howard, Phillip

Peer review is a critical process for ensuring the integrity of published scientific research. Confidence in this process is predicated on the assumption that experts in the relevant domain give careful consideration to the merits of manuscripts whic

External link: http://arxiv.org/abs/2410.03019

Report

Quantifying and Enabling the Interpretability of CLIP-like Models

Authors: Madasu, Avinash; Gandelsman, Yossi; Lal, Vasudev; Howard, Phillip

CLIP is one of the most popular foundational models and is heavily used for many vision-language tasks. However, little is known about the inner workings of CLIP. To bridge this gap, we propose a study to quantify the interpretability in CLIP-like mod

External link: http://arxiv.org/abs/2409.06579

Report

SocialCounterfactuals: Probing and Mitigating Intersectional Social Biases in Vision-Language Models with Counterfactual Examples

Authors: Howard, Phillip; Madasu, Avinash; Le, Tiep; Moreno, Gustavo Lujan; Bhiwandiwalla, Anahita; Lal, Vasudev

While vision-language models (VLMs) have achieved remarkable performance improvements recently, there is growing evidence that these models also possess harmful biases with respect to social attributes such as gender and race. Prior studies have prima

External link: http://arxiv.org/abs/2312.00825

Report

Analyzing Zero-Shot Abilities of Vision-Language Models on Video Understanding Tasks

Authors: Madasu, Avinash; Bhiwandiwalla, Anahita; Lal, Vasudev

Foundational multimodal models pre-trained on large-scale image-text pairs or video-text pairs or both have shown strong generalization abilities on downstream tasks. However, unlike image-text models, pretraining video-text models is not always feasi

External link: http://arxiv.org/abs/2310.04914

Report

Probing Intersectional Biases in Vision-Language Models with Counterfactual Examples

Authors: Howard, Phillip; Madasu, Avinash; Le, Tiep; Moreno, Gustavo Lujan; Lal, Vasudev

External link: http://arxiv.org/abs/2310.02988

Report

Affective Visual Dialog: A Large-Scale Benchmark for Emotional Reasoning Based on Visually Grounded Conversations

Authors: Haydarov, Kilichbek; Shen, Xiaoqian; Madasu, Avinash; Salem, Mahmoud; Li, Li-Jia; Elsayed, Gamaleldin; Elhoseiny, Mohamed

We introduce Affective Visual Dialog, an emotion explanation and reasoning task as a testbed for research on understanding the formation of emotions in visually grounded conversations. The task involves three skills: (1) Dialog-based Question Answeri

External link: http://arxiv.org/abs/2308.16349

Report

ICSVR: Investigating Compositional and Syntactic Understanding in Video Retrieval Models

Authors: Madasu, Avinash; Lal, Vasudev

Video retrieval (VR) involves retrieving the ground truth video from the video database given a text caption or vice-versa. The two important components of compositionality: objects & attributes and actions are joined using correct syntax to form a p

External link: http://arxiv.org/abs/2306.16533

Report

A Unified Framework for Slot based Response Generation in a Multimodal Dialogue System

Authors: Firdaus, Mauajama; Madasu, Avinash; Ekbal, Asif

Natural Language Understanding (NLU) and Natural Language Generation (NLG) are the two critical components of every conversational system that handles the task of understanding the user by capturing the necessary information in the form of slots and

External link: http://arxiv.org/abs/2305.17433

Report

Is Multimodal Vision Supervision Beneficial to Language?

Authors: Madasu, Avinash; Lal, Vasudev

Vision (image and video) - Language (VL) pre-training is the recent popular paradigm that achieved state-of-the-art results on multi-modal tasks like image-retrieval, video-retrieval, visual question answering etc. These models are trained in an unsu

External link: http://arxiv.org/abs/2302.05016

Report

What do Large Language Models Learn beyond Language?

Authors: Madasu, Avinash; Srivastava, Shashank

Large language models (LMs) have rapidly become a mainstay in Natural Language Processing. These models are known to acquire rich linguistic knowledge from training on large amounts of text. In this paper, we investigate if pre-training on text also

External link: http://arxiv.org/abs/2210.12302
