Showing 1 - 10 of 26 for search: '"Chen, Sanxing"'
Large Language Models (LLMs) are often augmented with external information as contexts, but this external information can sometimes be inaccurate or even intentionally misleading. We argue that robust LLMs should demonstrate situated faithfulness, dynamically calibrating their trust in external information …
External link:
http://arxiv.org/abs/2410.14675
We show that existing evaluations for fake news detection based on conventional sources, such as claims on fact-checking websites, result in an increasing accuracy over time for LLM-based detectors -- even after their knowledge cutoffs. This suggests …
External link:
http://arxiv.org/abs/2410.14651
Authors:
Bai, Yu, Zou, Xiyuan, Huang, Heyan, Chen, Sanxing, Rondeau, Marc-Antoine, Gao, Yang, Cheung, Jackie Chi Kit
Long sequence modeling has gained broad interest as large language models (LLMs) continue to advance. Recent research has identified that a large portion of hidden states within the key-value caches of Transformer models can be discarded (also termed evicted) …
External link:
http://arxiv.org/abs/2406.12018
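The result above concerns discarding entries from Transformer key-value caches. Purely as an illustration of the general idea, not the paper's method, here is a minimal NumPy sketch of score-based cache eviction; the evict_kv_cache helper and its importance scores are invented for the example.

```python
import numpy as np

def evict_kv_cache(keys, values, scores, budget):
    """Keep only the `budget` cached positions with the highest
    importance scores and drop the rest.

    keys, values: (seq_len, d) cached key/value projections
    scores:       (seq_len,) importance of each cached position
    """
    keep = np.argsort(scores)[-budget:]  # most important positions
    keep.sort()                          # restore temporal order
    return keys[keep], values[keep]

# Toy usage: an 8-entry cache squeezed down to a budget of 4.
rng = np.random.default_rng(0)
keys, values = rng.normal(size=(8, 16)), rng.normal(size=(8, 16))
attn_mass = rng.uniform(size=8)          # stand-in for accumulated attention
keys, values = evict_kv_cache(keys, values, attn_mass, budget=4)
print(keys.shape)                        # (4, 16)
```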
Authors:
Stureborg, Rickard, Chen, Sanxing, Xie, Ruoyu, Patel, Aayushi, Li, Christopher, Zhu, Chloe Qinyu, Hu, Tingnan, Yang, Jun, Dhingra, Bhuwan
One way to personalize chatbot interactions is by establishing common ground with the intended reader. A domain where establishing mutual understanding could be particularly impactful is vaccine concerns and misinformation. Vaccine interventions are …
External link:
http://arxiv.org/abs/2405.10861
The desire and ability to seek new information strategically are fundamental to human learning but often overlooked in current language agent evaluation. We analyze a popular web shopping task designed to test language agents' ability to perform strategic …
External link:
http://arxiv.org/abs/2404.09911
Authors:
Bai, Yu, Huang, Heyan, Piano, Cesare Spinoso-Di, Rondeau, Marc-Antoine, Chen, Sanxing, Gao, Yang, Cheung, Jackie Chi Kit
In-context learning (ICL) has become an effective solution for few-shot learning in natural language processing. However, our understanding of ICL's working mechanisms is limited, specifically regarding how models learn to perform tasks from ICL demonstrations …
External link:
http://arxiv.org/abs/2401.11323
Temporal and numerical expression understanding is of great importance in many downstream Natural Language Processing (NLP) and Information Retrieval (IR) tasks. However, much previous work covers only a few sub-types and focuses only on entity extraction …
External link:
http://arxiv.org/abs/2303.18103
Learning transferable representation of knowledge graphs (KGs) is challenging due to the heterogeneous, multi-relational nature of graph structures. Inspired by Transformer-based pretrained language models' success on learning transferable representation …
External link:
http://arxiv.org/abs/2303.15682
In Text-to-SQL semantic parsing, selecting the correct entities (tables and columns) for the generated SQL query is both crucial and challenging; the parser is required to connect the natural language (NL) question and the SQL query to the structured …
External link:
http://arxiv.org/abs/2009.14809
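The Text-to-SQL result above is about picking the right tables and columns for a query, often called schema linking. As a hedged sketch only, assuming a naive token-overlap heuristic rather than anything from the paper, a toy linker could look like this; link_schema and the two-table schema are hypothetical.

```python
import re

def link_schema(question, schema):
    """Rank table and column names by token overlap with the question.

    schema: dict mapping table name -> list of column names
    Returns (score, table, name) triples, best candidates first.
    """
    q_tokens = set(re.findall(r"[a-z]+", question.lower()))
    candidates = []
    for table, columns in schema.items():
        for name in [table] + columns:
            overlap = len(q_tokens & set(name.lower().split("_")))
            if overlap:
                candidates.append((overlap, table, name))
    return sorted(candidates, reverse=True)

# Toy usage against a made-up two-table schema.
schema = {"singer": ["singer_id", "name", "age"],
          "concert": ["concert_id", "singer_id", "year"]}
print(link_schema("What is the name and age of each singer?", schema))
# -> the singer table and its name/age/singer_id columns rank highest
```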
This paper examines the challenging problem of learning representations of entities and relations in a complex multi-relational knowledge graph. We propose HittER, a Hierarchical Transformer model to jointly learn Entity-relation composition and Relational contextualization …
External link:
http://arxiv.org/abs/2008.12813
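The last result names a two-level design: one block composes each (entity, relation) pair, another contextualizes over a source entity's neighbourhood. Below is a minimal PyTorch sketch of that kind of hierarchy, with all dimensions, pooling choices, and the TwoLevelKGEncoder name invented for illustration; it makes no claim to match HittER's actual architecture.

```python
import torch
import torch.nn as nn

class TwoLevelKGEncoder(nn.Module):
    """Sketch of a two-level hierarchy: a bottom Transformer composes
    each (entity, relation) pair, a top Transformer contextualizes the
    composed pairs around a source entity."""

    def __init__(self, n_entities, n_relations, dim=64):
        super().__init__()
        self.ent = nn.Embedding(n_entities, dim)
        self.rel = nn.Embedding(n_relations, dim)

        def encoder():
            layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
            return nn.TransformerEncoder(layer, num_layers=2)

        self.bottom, self.top = encoder(), encoder()

    def forward(self, neigh_ents, neigh_rels):
        # neigh_ents, neigh_rels: (batch, n_neighbours) index tensors
        b, n = neigh_ents.shape
        pairs = torch.stack([self.ent(neigh_ents), self.rel(neigh_rels)], dim=2)
        composed = self.bottom(pairs.reshape(b * n, 2, -1))[:, 0]  # one vector per pair
        return self.top(composed.reshape(b, n, -1)).mean(dim=1)    # source encoding

# Toy usage with random neighbourhoods.
enc = TwoLevelKGEncoder(n_entities=100, n_relations=10)
out = enc(torch.randint(0, 100, (4, 5)), torch.randint(0, 10, (4, 5)))
print(out.shape)  # torch.Size([4, 64])
```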