Showing 1 - 10 of 30 for search: '"Jain, Vinija"'
Visual Question-Answering (VQA) has become a key use-case in several applications to aid user experience, particularly after Vision-Language Models (VLMs) achieved good results in zero-shot inference. But evaluating different VLMs for an application
External link:
http://arxiv.org/abs/2409.09269
Author:
Barman, Niyar R, Sharma, Krish, Aziz, Ashhar, Bajpai, Shashwat, Biswas, Shwetangshu, Sharma, Vasu, Jain, Vinija, Chadha, Aman, Sheth, Amit, Das, Amitava
The rapid advancement of text-to-image generation systems, exemplified by models like Stable Diffusion, Midjourney, Imagen, and DALL-E, has heightened concerns about their potential misuse. In response, companies like Meta and Google have intensified
External link:
http://arxiv.org/abs/2408.10446
Assessing the effectiveness of large language models (LLMs) in addressing diverse tasks is essential for comprehending their strengths and weaknesses. Conventional evaluation techniques typically apply a single prompting strategy uniformly across dat
External link:
http://arxiv.org/abs/2406.12644
The rapid rise of Language Models (LMs) has expanded their use in several applications. Yet, due to constraints of model size, associated cost, or proprietary restrictions, utilizing state-of-the-art (SOTA) LLMs is not always feasible. With open, sma
External link:
http://arxiv.org/abs/2406.11402
Author:
Das, Amit, Zhang, Zheng, Jamshidi, Fatemeh, Jain, Vinija, Chadha, Aman, Raychawdhary, Nilanjana, Sandage, Mary, Pope, Lauramarie, Dozier, Gerry, Seals, Cheryl
Data annotation, the practice of assigning descriptive labels to raw data, is pivotal in optimizing the performance of machine learning models. However, it is a resource-intensive process susceptible to biases introduced by annotators. The emergence
External link:
http://arxiv.org/abs/2406.11109
This review paper provides a comprehensive overview of large language model (LLM) research directions within Indic languages. Indic languages are those spoken in the Indian subcontinent, including India, Pakistan, Bangladesh, Sri Lanka, Nepal, and Bh
External link:
http://arxiv.org/abs/2406.09559
An image is often said to be worth a thousand words, and certain images can tell rich and insightful stories. Can these stories be told via image captioning? Images from folklore genres, such as mythology, folk dance, cultural signs, and symbols, are
External link:
http://arxiv.org/abs/2405.17475
Despite the crucial importance of accelerating text generation in large language models (LLMs) for efficiently producing content, the sequential nature of this process often leads to high inference latency, posing challenges for real-time application
External link:
http://arxiv.org/abs/2405.13019
The rapid advancement of foundation models (FMs) across language, image, audio, and video domains has shown remarkable capabilities in diverse tasks. However, the proliferation of FMs brings forth a critical challenge: the potential to generate hallu
External link:
http://arxiv.org/abs/2405.09589
The rise of deep learning has marked significant progress in fields such as computer vision, natural language processing, and medical imaging, primarily through the adaptation of pre-trained models for specific tasks. Traditional fine-tuning methods,
External link:
http://arxiv.org/abs/2404.13506