Výsledky vyhledávání - "Purushwalkam, Senthil"

Report

Trust but Verify: Programmatic VLM Evaluation in the Wild

Autor: Prabhu, Viraj, Purushwalkam, Senthil, Yan, An, Xiong, Caiming, Xu, Ran

Vision-Language Models (VLMs) often generate plausible but incorrect responses to visual queries. However, reliably quantifying the effect of such hallucinations in free-form responses to open-ended queries is challenging as it requires visually veri

Externí odkaz: http://arxiv.org/abs/2410.13121

Zobrazit plný text záznamu

Report

FaithEval: Can Your Language Model Stay Faithful to Context, Even If 'The Moon is Made of Marshmallows'

Autor: Ming, Yifei, Purushwalkam, Senthil, Pandit, Shrey, Ke, Zixuan, Nguyen, Xuan-Phi, Xiong, Caiming, Joty, Shafiq

Ensuring faithfulness to context in large language models (LLMs) and retrieval-augmented generation (RAG) systems is crucial for reliable deployment in real-world applications, as incorrect or unsupported information can erode user trust. Despite adv

Externí odkaz: http://arxiv.org/abs/2410.03727

Zobrazit plný text záznamu

Report

SFR-RAG: Towards Contextually Faithful LLMs

Autor: Nguyen, Xuan-Phi, Pandit, Shrey, Purushwalkam, Senthil, Xu, Austin, Chen, Hailin, Ming, Yifei, Ke, Zixuan, Savarese, Silvio, Xong, Caiming, Joty, Shafiq

Retrieval Augmented Generation (RAG), a paradigm that integrates external contextual information with large language models (LLMs) to enhance factual accuracy and relevance, has emerged as a pivotal area in generative AI. The LLMs used in RAG applica

Externí odkaz: http://arxiv.org/abs/2409.09916

Zobrazit plný text záznamu

Report

xGen-VideoSyn-1: High-fidelity Text-to-Video Synthesis with Compressed Representations

Autor: Qin, Can, Xia, Congying, Ramakrishnan, Krithika, Ryoo, Michael, Tu, Lifu, Feng, Yihao, Shu, Manli, Zhou, Honglu, Awadalla, Anas, Wang, Jun, Purushwalkam, Senthil, Xue, Le, Zhou, Yingbo, Wang, Huan, Savarese, Silvio, Niebles, Juan Carlos, Chen, Zeyuan, Xu, Ran, Xiong, Caiming

We present xGen-VideoSyn-1, a text-to-video (T2V) generation model capable of producing realistic scenes from textual descriptions. Building on recent advancements, such as OpenAI's Sora, we explore the latent diffusion model (LDM) architecture and i

Externí odkaz: http://arxiv.org/abs/2408.12590

Zobrazit plný text záznamu

Report

xGen-MM (BLIP-3): A Family of Open Large Multimodal Models

This report introduces xGen-MM (also known as BLIP-3), a framework for developing Large Multimodal Models (LMMs). The framework comprises meticulously curated datasets, a training recipe, model architectures, and a resulting suite of LMMs. xGen-MM, s

Externí odkaz: http://arxiv.org/abs/2408.08872

Zobrazit plný text záznamu

Report

BootPIG: Bootstrapping Zero-shot Personalized Image Generation Capabilities in Pretrained Diffusion Models

Autor: Purushwalkam, Senthil, Gokul, Akash, Joty, Shafiq, Naik, Nikhil

Recent text-to-image generation models have demonstrated incredible success in generating images that faithfully follow input prompts. However, the requirement of using words to describe a desired concept provides limited control over the appearance

Externí odkaz: http://arxiv.org/abs/2401.13974

Zobrazit plný text záznamu

Report

Diffusion Model Alignment Using Direct Preference Optimization

Autor: Wallace, Bram, Dang, Meihua, Rafailov, Rafael, Zhou, Linqi, Lou, Aaron, Purushwalkam, Senthil, Ermon, Stefano, Xiong, Caiming, Joty, Shafiq, Naik, Nikhil

Large language models (LLMs) are fine-tuned using human comparison data with Reinforcement Learning from Human Feedback (RLHF) methods to make them better aligned with users' preferences. In contrast to LLMs, human preference learning has not been wi

Externí odkaz: http://arxiv.org/abs/2311.12908

Zobrazit plný text záznamu

Report

ConRad: Image Constrained Radiance Fields for 3D Generation from a Single Image

Autor: Purushwalkam, Senthil, Naik, Nikhil

We present a novel method for reconstructing 3D objects from a single RGB image. Our method leverages the latest image generation models to infer the hidden 3D structure while remaining faithful to the input image. While existing methods obtain impre

Externí odkaz: http://arxiv.org/abs/2311.05230

Zobrazit plný text záznamu

Report

XGen-7B Technical Report

Large Language Models (LLMs) have become ubiquitous across various domains, transforming the way we interact with information and conduct research. However, most high-performing LLMs remain confined behind proprietary walls, hindering scientific prog

Externí odkaz: http://arxiv.org/abs/2309.03450

Zobrazit plný text záznamu

Report

The Challenges of Continuous Self-Supervised Learning

Autor: Purushwalkam, Senthil, Morgado, Pedro, Gupta, Abhinav

Self-supervised learning (SSL) aims to eliminate one of the major bottlenecks in representation learning - the need for human annotations. As a result, SSL holds the promise to learn representations from data in-the-wild, i.e., without the need for f

Externí odkaz: http://arxiv.org/abs/2203.12710

Zobrazit plný text záznamu

Vyhledávací nástroje:

Upřesnit hledání