Showing 1 - 10 of 7,550 for the search: '"Vasudev As"'
Multimodal models typically combine a powerful large language model (LLM) with a vision encoder and are then trained on multimodal data via instruction tuning. While this process adapts LLMs to multimodal settings, it remains unclear whether this adaptation…
External link:
http://arxiv.org/abs/2412.03467
Author:
Stan, Gabriela Ben-Melech, Aflalo, Estelle, Luo, Man, Rosenman, Shachar, Le, Tiep, Paul, Sayak, Tseng, Shao-Yen, Lal, Vasudev
While Large Vision Language Models (LVLMs) have become masterly capable in reasoning over human prompts and visual inputs, they are still prone to producing responses that contain misinformation. Identifying incorrect responses that are not grounded…
External link:
http://arxiv.org/abs/2412.01487
Author:
Gohil, Vasudev, DeLorenzo, Matthew, Nallam, Veera Vishwa Achuta Sai Venkat, See, Joey, Rajendran, Jeyavijayan
The rapid advancement of large language models (LLMs) has enabled the ability to effectively analyze and generate code nearly instantaneously, resulting in their widespread adoption in software development. Following this advancement, researchers and…
External link:
http://arxiv.org/abs/2411.16111
Author:
Glorioso, Paolo, Anthony, Quentin, Tokpanov, Yury, Golubeva, Anna, Shyam, Vasudev, Whittington, James, Pilault, Jonathan, Millidge, Beren
In this technical report, we present the Zamba2 series -- a suite of 1.2B, 2.7B, and 7.4B parameter hybrid Mamba2-transformer models that achieve state-of-the-art performance against the leading open-weights models of their class, while achieving…
External link:
http://arxiv.org/abs/2411.15242
Author:
Ratzlaff, Neale, Olson, Matthew Lyle, Hinck, Musashi, Aflalo, Estelle, Tseng, Shao-Yen, Lal, Vasudev, Howard, Phillip
Large Multi-Modal Models (LMMs) have demonstrated impressive capabilities as general-purpose chatbots that can engage in conversations about a provided input, such as an image. However, their responses are influenced by societal biases present in the…
External link:
http://arxiv.org/abs/2411.12590
Author:
Notton, Cassandre, Sharma, Vasudev, Trinh, Vincent Quoc-Huy, Chen, Lina, Xu, Minqi, Varma, Sonal, Hosseini, Mahdi S.
Colorectal cancer (CRC) is one of the few cancers that have an established dysplasia-carcinoma sequence that benefits from screening. Everyone over 50 years of age in Canada is eligible for CRC screening. About 20% of those people will undergo a biopsy…
External link:
http://arxiv.org/abs/2411.05959
Author:
Choubey, Prafulla Kumar, Su, Xin, Luo, Man, Peng, Xiangyu, Xiong, Caiming, Le, Tiep, Rosenman, Shachar, Lal, Vasudev, Mui, Phil, Ho, Ricky, Howard, Phillip, Wu, Chien-Sheng
Knowledge graphs (KGs) generated by large language models (LLMs) are becoming increasingly valuable for Retrieval-Augmented Generation (RAG) applications that require knowledge-intensive reasoning. However, existing KG extraction methods predominantly…
External link:
http://arxiv.org/abs/2410.16597
Author:
Ratzlaff, Neale, Olson, Matthew Lyle, Hinck, Musashi, Tseng, Shao-Yen, Lal, Vasudev, Howard, Phillip
Large Vision Language Models (LVLMs) such as LLaVA have demonstrated impressive capabilities as general-purpose chatbots that can engage in conversations about a provided input image. However, their responses are influenced by societal biases present…
External link:
http://arxiv.org/abs/2410.13976
Peer review is a critical process for ensuring the integrity of published scientific research. Confidence in this process is predicated on the assumption that experts in the relevant domain give careful consideration to the merits of manuscripts which…
External link:
http://arxiv.org/abs/2410.03019
CLIP is one of the most popular foundational models and is heavily used for many vision-language tasks. However, little is known about the inner workings of CLIP. To bridge this gap, we propose a study to quantify the interpretability in CLIP-like models…
External link:
http://arxiv.org/abs/2409.06579