Search results

Report

FiVL: A Framework for Improved Vision-Language Alignment

Authors: Aflalo, Estelle; Stan, Gabriela Ben Melech; Le, Tiep; Luo, Man; Rosenman, Shachar; Paul, Sayak; Tseng, Shao-Yen; Lal, Vasudev

Large Vision Language Models (LVLMs) have achieved significant progress in integrating visual and textual inputs for multimodal reasoning. However, a recurring challenge is ensuring these models utilize visual information as effectively as linguistic

External link: http://arxiv.org/abs/2412.14672

Report

Cosmic Multipoles in Galaxy Surveys Part I: How Inferences Depend on Source Counts and Masks

Authors: Oayda, Oliver T.; Mittal, Vasudev; Lewis, Geraint F.

We present a new approach to constructing and fitting dipoles and higher-order multipoles in synthetic galaxy samples over the sky. Within our Bayesian paradigm, we illustrate that this technique is robust to masked skies, allowing us to make credibl

External link: http://arxiv.org/abs/2412.12600

Report

Causal World Representation in the GPT Model

Authors: Rohekar, Raanan Y.; Gurwicz, Yaniv; Yu, Sungduk; Lal, Vasudev

Are generative pre-trained transformer (GPT) models only trained to predict the next token, or do they implicitly learn a world model from which a sequence is generated one token at a time? We examine this question by deriving a causal interpretation

External link: http://arxiv.org/abs/2412.07446

Report

Steering Large Language Models to Evaluate and Amplify Creativity

Authors: Olson, Matthew Lyle; Ratzlaff, Neale; Hinck, Musashi; Tseng, Shao-yen; Lal, Vasudev

Although capable of generating creative text, Large Language Models (LLMs) are poor judges of what constitutes "creativity". In this work, we show that we can leverage this knowledge of how to write creatively in order to better judge what is creativ

External link: http://arxiv.org/abs/2412.06060

Report

Training-Free Mitigation of Language Reasoning Degradation After Multimodal Instruction Tuning

Authors: Ratzlaff, Neale; Luo, Man; Su, Xin; Lal, Vasudev; Howard, Phillip

Multimodal models typically combine a powerful large language model (LLM) with a vision encoder and are then trained on multimodal data via instruction tuning. While this process adapts LLMs to multimodal settings, it remains unclear whether this ada

External link: http://arxiv.org/abs/2412.03467

Report

FastRM: An efficient and automatic explainability framework for multimodal generative models

Authors: Stan, Gabriela Ben-Melech; Aflalo, Estelle; Luo, Man; Rosenman, Shachar; Le, Tiep; Paul, Sayak; Tseng, Shao-Yen; Lal, Vasudev

While Large Vision Language Models (LVLMs) have become masterly capable in reasoning over human prompts and visual inputs, they are still prone to producing responses that contain misinformation. Identifying incorrect responses that are not grounded

External link: http://arxiv.org/abs/2412.01487

Report

LLMPirate: LLMs for Black-box Hardware IP Piracy

Authors: Gohil, Vasudev; DeLorenzo, Matthew; Nallam, Veera Vishwa Achuta Sai Venkat; See, Joey; Rajendran, Jeyavijayan

The rapid advancement of large language models (LLMs) has enabled the ability to effectively analyze and generate code nearly instantaneously, resulting in their widespread adoption in software development. Following this advancement, researchers and

External link: http://arxiv.org/abs/2411.16111

Report

The Zamba2 Suite: Technical Report

Authors: Glorioso, Paolo; Anthony, Quentin; Tokpanov, Yury; Golubeva, Anna; Shyam, Vasudev; Whittington, James; Pilault, Jonathan; Millidge, Beren

In this technical report, we present the Zamba2 series -- a suite of 1.2B, 2.7B, and 7.4B parameter hybrid Mamba2-transformer models that achieve state of the art performance against the leading open-weights models of their class, while achieving sub

External link: http://arxiv.org/abs/2411.15242

Report

Debias your Large Multi-Modal Model at Test-Time with Non-Contrastive Visual Attribute Steering

Authors: Ratzlaff, Neale; Olson, Matthew Lyle; Hinck, Musashi; Aflalo, Estelle; Tseng, Shao-Yen; Lal, Vasudev; Howard, Phillip

Large Multi-Modal Models (LMMs) have demonstrated impressive capabilities as general-purpose chatbots that can engage in conversations about a provided input, such as an image. However, their responses are influenced by societal biases present in the

External link: http://arxiv.org/abs/2411.12590

Report

Efficient Self-Supervised Barlow Twins from Limited Tissue Slide Cohorts for Colonic Pathology Diagnostics

Authors: Notton, Cassandre; Sharma, Vasudev; Trinh, Vincent Quoc-Huy; Chen, Lina; Xu, Minqi; Varma, Sonal; Hosseini, Mahdi S.

Colorectal cancer (CRC) is one of the few cancers that have an established dysplasia-carcinoma sequence that benefits from screening. Everyone over 50 years of age in Canada is eligible for CRC screening. About 20% of those people will undergo a bio

External link: http://arxiv.org/abs/2411.05959
