Showing 1 - 10 of 72 for search: '"Cheung, Jackie Chi Kit"'
Author:
Bai, Yu, Zou, Xiyuan, Huang, Heyan, Chen, Sanxing, Rondeau, Marc-Antoine, Gao, Yang, Cheung, Jackie Chi Kit
Long sequence modeling has gained broad interest as large language models (LLMs) continue to advance. Recent research has identified that a large portion of hidden states within the key-value caches of Transformer models can be discarded (also termed …
External link:
http://arxiv.org/abs/2406.12018
Author:
Liu, Yu Lu, Blodgett, Su Lin, Cheung, Jackie Chi Kit, Liao, Q. Vera, Olteanu, Alexandra, Xiao, Ziang
Benchmarking is seen as critical to assessing progress in NLP. However, creating a benchmark involves many design decisions (e.g., which datasets to include, which metrics to use) that often rely on tacit, untested assumptions about what the benchmark …
External link:
http://arxiv.org/abs/2406.08723
All state-of-the-art coreference resolution (CR) models involve finetuning a pretrained language model. Whether the superior performance of one CR model over another is due to the choice of language model or other factors, such as the task-specific a…
External link:
http://arxiv.org/abs/2404.00727
State-of-the-art language models (LMs) sometimes generate non-factual hallucinations that misalign with world knowledge. To explore the mechanistic causes of these hallucinations, we create diagnostic datasets with subject-relation queries and adapt …
External link:
http://arxiv.org/abs/2403.18167
Assessing the quality of summarizers poses significant challenges. In response, we propose a novel task-oriented evaluation approach that assesses summarizers based on their capacity to produce summaries that are useful for downstream tasks, while pr…
External link:
http://arxiv.org/abs/2402.19457
Author:
Bai, Yu, Huang, Heyan, Piano, Cesare Spinoso-Di, Rondeau, Marc-Antoine, Chen, Sanxing, Gao, Yang, Cheung, Jackie Chi Kit
In-context learning (ICL) has become an effective solution for few-shot learning in natural language processing. However, our understanding of ICL's working mechanisms is limited, specifically regarding how models learn to perform tasks from ICL demo…
External link:
http://arxiv.org/abs/2401.11323
Author:
Liu, Yu Lu, Cao, Meng, Blodgett, Su Lin, Cheung, Jackie Chi Kit, Olteanu, Alexandra, Trischler, Adam
AI and NLP publication venues have increasingly encouraged researchers to reflect on possible ethical considerations, adverse impacts, and other responsible AI issues their work might engender. However, for specific NLP tasks our understanding of how …
External link:
http://arxiv.org/abs/2311.11103
While large language models (LLMs) have achieved impressive performance in generating fluent and realistic text, controlling the generated text so that it exhibits properties such as safety, factuality, and non-toxicity remains challenging. …
External link:
http://arxiv.org/abs/2311.04921
We present Vārta, a large-scale multilingual dataset for headline generation in Indic languages. This dataset includes 41.8 million news articles in 14 different Indic languages (and English), which come from a variety of high-quality sources. To t…
External link:
http://arxiv.org/abs/2305.05858
It is increasingly common to evaluate the same coreference resolution (CR) model on multiple datasets. Do these multi-dataset evaluations allow us to draw meaningful conclusions about model generalization? Or, do they rather reflect the idiosyncrasies …
External link:
http://arxiv.org/abs/2303.09092