Showing 1 - 10 of 194 for the search: '"MUKHERJEE, Subhabrata"'
Author:
Jiang, Yifan, Aggarwal, Kriti, Laud, Tanmay, Munir, Kashif, Pujara, Jay, Mukherjee, Subhabrata
The rapid progress of Large Language Models (LLMs) has opened up new opportunities across various domains and applications; yet it also presents challenges related to potential misuse. To mitigate such risks, red teaming has been employed as a proactive…
External link:
http://arxiv.org/abs/2409.17458
Author:
Ding, Dujian, Mallick, Ankur, Wang, Chi, Sim, Robert, Mukherjee, Subhabrata, Ruhle, Victor, Lakshmanan, Laks V. S., Awadallah, Ahmed Hassan
Large language models (LLMs) excel in most NLP tasks but also require expensive cloud servers for deployment due to their size, while smaller models that can be deployed on lower-cost (e.g., edge) devices tend to lag behind in terms of response quality…
External link:
http://arxiv.org/abs/2404.14618
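The snippet above describes routing queries between a cheap small model and an expensive large model. A minimal sketch of that idea, assuming a hypothetical quality predictor (here a toy word-count heuristic, not the paper's actual router):

```python
# Cost-aware query routing between a small on-device model and a large
# cloud model. The quality predictor and the threshold are illustrative
# assumptions, not components from the paper.

def predict_small_model_quality(query: str) -> float:
    """Hypothetical scorer: estimate how well the small model would answer
    (0.0 = poorly, 1.0 = as well as the large model). Toy heuristic:
    short queries are assumed easy."""
    return 1.0 if len(query.split()) <= 8 else 0.3

def route(query: str, threshold: float = 0.5) -> str:
    """Send 'easy' queries to the cheap small model, the rest to the
    expensive large model."""
    if predict_small_model_quality(query) >= threshold:
        return "small"
    return "large"
```

In a real system the heuristic would be replaced by a learned predictor of the small model's answer quality, and the threshold tuned to trade off cost against quality.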
Author:
Mukherjee, Subhabrata, Gamble, Paul, Ausin, Markel Sanz, Kant, Neel, Aggarwal, Kriti, Manjunath, Neha, Datta, Debajyoti, Liu, Zhengliang, Ding, Jiayuan, Busacca, Sophia, Bianco, Cezanne, Sharma, Swapnil, Lasko, Rae, Voisard, Michelle, Harneja, Sanchay, Filippova, Darya, Meixiong, Gerry, Cha, Kevin, Youssefi, Amir, Buvanesh, Meyhaa, Weingram, Howard, Bierman-Lytle, Sebastian, Mangat, Harpreet Singh, Parikh, Kim, Godil, Saad, Miller, Alex
We develop Polaris, the first safety-focused LLM constellation for real-time patient-AI healthcare conversations. Unlike prior LLM works in healthcare focusing on tasks like question answering, our work specifically focuses on long multi-turn voice conversations…
External link:
http://arxiv.org/abs/2403.13313
Author:
Jones, Erik, Palangi, Hamid, Simões, Clarisse, Chandrasekaran, Varun, Mukherjee, Subhabrata, Mitra, Arindam, Awadallah, Ahmed, Kamar, Ece
Large language models (LLMs) frequently hallucinate on abstractive summarization tasks such as document-based question-answering, meeting summarization, and clinical report generation, even though all necessary information is included in context. How…
External link:
http://arxiv.org/abs/2310.06827
Author:
Pham, Hai, Kim, Young Jin, Mukherjee, Subhabrata, Woodruff, David P., Poczos, Barnabas, Awadalla, Hany Hassan
Mixture-of-experts (MoE) architecture has been proven a powerful method for diverse tasks in training deep models in many applications. However, current MoE implementations are task agnostic, treating all tokens from different tasks in the same manner…
External link:
http://arxiv.org/abs/2308.15772
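The snippet above contrasts task-agnostic MoE routing with task-aware routing. A toy sketch of task-conditioned top-1 routing, where the gate sees a task id alongside the token (shapes and gating scheme are illustrative assumptions, not the paper's architecture):

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_experts, n_tasks = 4, 3, 2

# One weight matrix per expert; the gate sees token features + task one-hot,
# so tokens from different tasks can prefer different experts.
experts = [rng.normal(size=(d, d)) for _ in range(n_experts)]
gate_w = rng.normal(size=(d + n_tasks, n_experts))

def moe_forward(token: np.ndarray, task_id: int) -> np.ndarray:
    task_onehot = np.eye(n_tasks)[task_id]
    logits = np.concatenate([token, task_onehot]) @ gate_w
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                 # softmax over experts
    top = int(probs.argmax())            # top-1 routing: one expert per token
    return experts[top] @ token
```

A task-agnostic gate would drop the `task_onehot` input, forcing all tasks through the same routing function.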
Author:
Del Corro, Luciano, Del Giorno, Allie, Agarwal, Sahaj, Yu, Bin, Awadallah, Ahmed, Mukherjee, Subhabrata
Autoregressive large language models (LLMs) have made remarkable progress in various natural language generation tasks. However, they incur high computation cost and latency resulting from the autoregressive token-by-token generation. To address this…
External link:
http://arxiv.org/abs/2307.02628
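The latency the snippet above refers to comes from the decoding loop itself: each generated token requires another forward pass. A minimal sketch with a toy stand-in for the model (the scorer is a hypothetical placeholder, not the paper's method):

```python
def toy_model(prefix):
    # Hypothetical next-token scorer: deterministically continues with
    # the prefix length modulo a tiny vocabulary of 5 tokens.
    return len(prefix) % 5

def generate(prompt, max_new_tokens=4):
    """Greedy autoregressive decoding: one model call per new token,
    so cost grows linearly with the output length."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):  # one forward pass per iteration
        tokens.append(toy_model(tokens))
    return tokens
```

Approaches to cutting this cost typically reduce the per-token work or generate several tokens per model call rather than strictly one at a time.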
State-of-the-art few-shot learning (FSL) methods leverage prompt-based fine-tuning to obtain remarkable results for natural language understanding (NLU) tasks. While much of the prior FSL methods focus on improving downstream task performance, there…
External link:
http://arxiv.org/abs/2306.11066
Author:
Mukherjee, Subhabrata, Mitra, Arindam, Jawahar, Ganesh, Agarwal, Sahaj, Palangi, Hamid, Awadallah, Ahmed
Recent research has focused on enhancing the capability of smaller models through imitation learning, drawing on the outputs generated by large foundation models (LFMs). A number of issues impact the quality of these models, ranging from limited imitation…
External link:
http://arxiv.org/abs/2306.02707
Author:
Jin, Woojeong, Mukherjee, Subhabrata, Cheng, Yu, Shen, Yelong, Chen, Weizhu, Awadallah, Ahmed Hassan, Jose, Damien, Ren, Xiang
Generalization to unseen tasks is an important ability for few-shot learners to achieve better zero-/few-shot performance on diverse tasks. However, such generalization to vision-language tasks including grounding and generation tasks has been under-…
External link:
http://arxiv.org/abs/2305.14676
Augmented Language Models (ALMs) blend the reasoning capabilities of Large Language Models (LLMs) with tools that allow for knowledge retrieval and action execution. Existing ALM systems trigger LLM thought processes while pulling observations from t…
External link:
http://arxiv.org/abs/2305.18323