Showing 1 - 10 of 44 for search: '"Monath, Nicholas"'
Large Language Models (LLMs) can transfer their reasoning skills to smaller models by teaching them to generate the intermediate reasoning process required to solve multistep reasoning tasks. While LLMs can accurately solve reasoning tasks through a
External link:
http://arxiv.org/abs/2410.18574
Author:
Monath, Nicholas, Grathwohl, Will, Boratko, Michael, Fergus, Rob, McCallum, Andrew, Zaheer, Manzil
In dense retrieval, deep encoders provide embeddings for both inputs and targets, and the softmax function is used to parameterize a distribution over a large number of candidate targets (e.g., textual passages for information retrieval). Significant
External link:
http://arxiv.org/abs/2409.01890
Author:
Godbole, Ameya, Monath, Nicholas, Kim, Seungyeon, Rawat, Ankit Singh, McCallum, Andrew, Zaheer, Manzil
In text generation, hallucinations refer to the generation of seemingly coherent text that contradicts established knowledge. One compelling hypothesis is that hallucinations occur when a language model is given a generation task outside its parametr
External link:
http://arxiv.org/abs/2408.10490
Cross-encoder (CE) models, which compute similarity by jointly encoding a query-item pair, perform better than embedding-based models (dual-encoders) at estimating query-item relevance. Existing approaches perform k-NN search with CE by approximating t
External link:
http://arxiv.org/abs/2405.03651
Author:
Chowdhury, Somnath Basu Roy, Monath, Nicholas, Dubey, Avinava, Zaheer, Manzil, McCallum, Andrew, Ahmed, Amr, Chaturvedi, Snigdha
Extractive opinion summarization involves automatically producing a summary of text about an entity (e.g., a product's reviews) by extracting representative sentences that capture prevalent opinions in the review set. Typically, in online marketplace
External link:
http://arxiv.org/abs/2401.08047
Author:
Chowdhury, Somnath Basu Roy, Monath, Nicholas, Dubey, Avinava, Ahmed, Amr, Chaturvedi, Snigdha
Distributed representations provide a vector space that captures meaningful relationships between data instances. The distributed nature of these representations, however, entangles together multiple attributes or concepts of data instances (e.g., th
External link:
http://arxiv.org/abs/2312.00194
Author:
Chowdhury, Somnath Basu Roy, Monath, Nicholas, Beirami, Ahmad, Kidambi, Rahul, Dubey, Avinava, Ahmed, Amr, Chaturvedi, Snigdha
Fairness, especially group fairness, is an important consideration in the context of machine learning systems. The most commonly adopted group fairness-enhancing techniques are in-processing methods that rely on a mixture of a fairness objective (e.g
External link:
http://arxiv.org/abs/2310.11401
Cross-encoder models, which jointly encode and score a query-item pair, are prohibitively expensive for direct k-nearest neighbor (k-NN) search. Consequently, k-NN search typically employs a fast approximate retrieval (e.g. using BM25 or dual-encoder
External link:
http://arxiv.org/abs/2305.02996
Dual encoder models are ubiquitous in modern classification and retrieval. Crucial for training such dual encoders is an accurate estimation of gradients from the partition function of the softmax over the large output space; this requires finding ne
External link:
http://arxiv.org/abs/2303.15311
Recent years have seen a paradigm shift in NLP towards using pretrained language models (PLM) for a wide range of tasks. However, there are many difficult design decisions to represent structures (e.g. tagged text, coreference chains) in a way such
External link:
http://arxiv.org/abs/2210.14698