Showing 1 - 10 of 49 for search: "Chandrasekaran, Varun"
We introduce "generative monoculture", a behavior observed in large language models (LLMs) characterized by a significant narrowing of model output diversity relative to the available training data for a given task: for example, generating only positive…
External link:
http://arxiv.org/abs/2407.02209
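Since the record above defines generative monoculture as a narrowing of output diversity relative to the training data, a toy way to make that concrete is to compare a standard diversity statistic between the two text pools. The sketch below uses distinct-n, a common lexical-diversity proxy; it is illustrative only and not necessarily the paper's own metric, and the review snippets are invented.

```python
# Toy sketch: compare lexical diversity of training data vs. model outputs.
# distinct-n (fraction of unique n-grams) is a generic proxy, not the
# paper's metric -- a markedly lower score for model outputs would be
# consistent with monoculture.

def distinct_n(texts, n=2):
    """Fraction of n-grams that are unique across a collection of texts."""
    ngrams = []
    for text in texts:
        tokens = text.split()
        ngrams.extend(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return len(set(ngrams)) / max(len(ngrams), 1)

# Hypothetical data: mixed-sentiment human reviews vs. uniformly positive
# model generations for the same book-review task.
human_reviews = ["the plot dragged badly", "a tender, luminous novel",
                 "flat characters ruin it", "sharp prose but a weak ending"]
model_reviews = ["a wonderful and moving novel", "a moving, wonderful novel",
                 "a wonderful moving story", "a moving and wonderful story"]

print(distinct_n(human_reviews), distinct_n(model_reviews))
```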
Author:
Wu, Qilong, Chandrasekaran, Varun
Watermarking approaches are proposed to identify whether text in circulation is human- or large language model (LLM)-generated. The state-of-the-art watermarking strategy of Kirchenbauer et al. (2023a) biases the LLM to generate specific ("green") tokens…
External link:
http://arxiv.org/abs/2403.14719
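The Kirchenbauer et al. (2023a) scheme mentioned above works, roughly, by pseudorandomly partitioning the vocabulary at each step (seeded on the preceding context) and adding a bias to the logits of the "green" partition; detection then tests whether a text contains improbably many green tokens. A minimal sketch of the biasing step, with a toy seeding rule and illustrative parameter values:

```python
import random

def green_list(prev_token_id, vocab_size, green_fraction=0.5):
    """Pseudorandom 'green' subset of the vocabulary, keyed on the
    previous token -- a toy stand-in for the scheme's context hashing."""
    rng = random.Random(prev_token_id)
    ids = list(range(vocab_size))
    rng.shuffle(ids)
    return set(ids[: int(green_fraction * vocab_size)])

def watermark_logits(logits, prev_token_id, delta=2.0):
    """Bias generation toward green tokens by adding delta to their logits."""
    green = green_list(prev_token_id, len(logits))
    return [x + delta if i in green else x for i, x in enumerate(logits)]

# A detector re-derives the green list at every position and flags text
# whose green-token count is statistically improbable under the null.
```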
Pretrained language models (PLMs) have shown remarkable few-shot learning capabilities when provided with properly formatted examples. However, selecting the "best" examples remains an open challenge. We propose a complexity-based prompt selection approach…
External link:
http://arxiv.org/abs/2403.03861
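The record names a complexity-based selection criterion but the snippet does not spell it out, so the following is only a hypothetical reading: score candidate demonstrations with some complexity proxy and pick those closest to the query. Both the proxy (token count) and select_examples are placeholders, not the paper's method.

```python
def complexity(text):
    """Placeholder proxy: token count. The paper's actual measure of
    example complexity is not given in the snippet above."""
    return len(text.split())

def select_examples(candidates, query, k=3):
    """Pick the k demonstrations whose complexity best matches the query's."""
    target = complexity(query)
    return sorted(candidates, key=lambda c: abs(complexity(c) - target))[:k]
```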
Author:
Wu, Fan, Inan, Huseyin A., Backurs, Arturs, Chandrasekaran, Varun, Kulkarni, Janardhan, Sim, Robert
Positioned between pre-training and user deployment, aligning large language models (LLMs) through reinforcement learning (RL) has emerged as a prevailing strategy for training instruction-following models such as ChatGPT. In this work, we initiate the…
External link:
http://arxiv.org/abs/2310.16960
Author:
Abdin, Marah I, Gunasekar, Suriya, Chandrasekaran, Varun, Li, Jerry, Yuksekgonul, Mert, Peshawaria, Rahee Ghosh, Naik, Ranjita, Nushi, Besmira
We study the ability of state-of-the-art models to answer constraint satisfaction queries for information retrieval (e.g., 'a list of ice cream shops in San Diego'). In the past, such queries were considered tasks that could only be solved via…
External link:
http://arxiv.org/abs/2310.15511
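A constraint satisfaction query like the one quoted above can be viewed as a conjunction of predicates that every returned item must satisfy, which also suggests how a correctness check can be automated. A small illustrative harness (the field names and data are invented, not from the paper):

```python
# Each constraint is a predicate over a candidate item; an answer to
# "a list of ice cream shops in San Diego" must satisfy all of them.
constraints = [
    lambda item: item.get("category") == "ice cream shop",
    lambda item: item.get("city") == "San Diego",
]

def satisfies_all(item):
    return all(constraint(item) for constraint in constraints)

model_answers = [
    {"name": "Scoops", "category": "ice cream shop", "city": "San Diego"},
    {"name": "Beanery", "category": "coffee shop", "city": "San Diego"},
]
print([a["name"] for a in model_answers if satisfies_all(a)])  # ['Scoops']
```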
Membership Inference Attacks (MIAs) aim to identify specific data samples within the private training dataset of machine learning models, leading to serious privacy violations and other sophisticated threats. Many practical black-box MIAs require querying…
External link:
http://arxiv.org/abs/2310.08015
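For context on the black-box attacks the record refers to, a classic baseline (loss thresholding in the style of Yeom et al. 2018, not necessarily the attack studied in this paper) predicts membership when the model's loss on a sample is unusually low:

```python
import math

def nll(prob_true_label):
    """Negative log-likelihood the model assigns to the correct label."""
    return -math.log(max(prob_true_label, 1e-12))

def predict_member(prob_true_label, threshold):
    """Low loss (high confidence) on a sample suggests it was trained on."""
    return nll(prob_true_label) < threshold

# Calibrate the threshold on samples known not to be in training
# (assumption: the attacker holds such a reference set).
non_member_losses = [nll(p) for p in (0.35, 0.28, 0.41, 0.22)]
threshold = sum(non_member_losses) / len(non_member_losses)
print(predict_member(0.97, threshold))  # True -> flagged as a likely member
```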
Large language models (LLMs) are documented to struggle in settings that require complex reasoning. Nevertheless, instructing the model to break the problem down into smaller reasoning steps, or ensembling various generations through modified decoding…
External link:
http://arxiv.org/abs/2310.07088
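One widely used instance of "ensembling various generations through modified decoding" is self-consistency: sample several reasoning chains at nonzero temperature and majority-vote their final answers. In the sketch below, sample_final_answer is a stub standing in for a real LLM call:

```python
from collections import Counter
import random

def sample_final_answer(question, temperature=0.8):
    """Stub for one sampled chain of thought reduced to its final answer;
    a real implementation would call an LLM at the given temperature."""
    return random.choice(["42", "42", "42", "41"])  # toy: usually correct

def self_consistency(question, num_samples=9):
    """Majority vote over independently sampled answers."""
    votes = Counter(sample_final_answer(question) for _ in range(num_samples))
    return votes.most_common(1)[0][0]

print(self_consistency("What is 6 * 7?"))  # almost always '42'
```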
Author:
Jones, Erik, Palangi, Hamid, Simões, Clarisse, Chandrasekaran, Varun, Mukherjee, Subhabrata, Mitra, Arindam, Awadallah, Ahmed, Kamar, Ece
Large language models (LLMs) frequently hallucinate on abstractive summarization tasks such as document-based question answering, meeting summarization, and clinical report generation, even though all necessary information is included in context. However…
External link:
http://arxiv.org/abs/2310.06827
Author:
Yuksekgonul, Mert, Chandrasekaran, Varun, Jones, Erik, Gunasekar, Suriya, Naik, Ranjita, Palangi, Hamid, Kamar, Ece, Nushi, Besmira
We investigate the internal behavior of Transformer-based Large Language Models (LLMs) when they generate factually incorrect text. We propose modeling factual queries as constraint satisfaction problems and use this framework to investigate how the…
External link:
http://arxiv.org/abs/2309.15098
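To make the constraint-satisfaction framing above concrete: one way to operationalize it is to measure how much attention the generated tokens place on the tokens expressing the query's constraints, and to use that mass as a signal of factual correctness. The sketch below is a loose paraphrase with invented shapes and an illustrative cutoff, not the paper's exact probe:

```python
import numpy as np

def constraint_attention_mass(attn, constraint_positions):
    """attn: attention weights of shape (layers, heads, query_len, key_len).
    Returns the mean attention flowing onto the tokens that express the
    query's constraints (e.g., the year in 'a novelist born in 1956')."""
    return float(attn[..., constraint_positions].mean())

# Stand-in tensor; real maps would come from a forward pass that returns
# attention (e.g., output_attentions=True in common libraries).
attn = np.random.rand(12, 8, 10, 10)
attn /= attn.sum(axis=-1, keepdims=True)  # rows sum to 1, like softmax output

mass = constraint_attention_mass(attn, [2, 3])
print("likely factual" if mass > 0.1 else "possible factual error")  # toy cutoff
```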
Author:
Bubeck, Sébastien, Chandrasekaran, Varun, Eldan, Ronen, Gehrke, Johannes, Horvitz, Eric, Kamar, Ece, Lee, Peter, Lee, Yin Tat, Li, Yuanzhi, Lundberg, Scott, Nori, Harsha, Palangi, Hamid, Ribeiro, Marco Tulio, Zhang, Yi
Artificial intelligence (AI) researchers have been developing and refining large language models (LLMs) that exhibit remarkable capabilities across a variety of domains and tasks, challenging our understanding of learning and cognition. The latest model…
External link:
http://arxiv.org/abs/2303.12712