Showing 1 - 10 of 873 for search: '"ROTH, DAN"'
NAVCON: A Cognitively Inspired and Linguistically Grounded Corpus for Vision and Language Navigation
Author:
Wanchoo, Karan, Zuo, Xiaoye, Gonzalez, Hannah, Dan, Soham, Georgakis, Georgios, Roth, Dan, Daniilidis, Kostas, Miltsakaki, Eleni
We present NAVCON, a large-scale annotated Vision-Language Navigation (VLN) corpus built on top of two popular datasets (R2R and RxR). The paper introduces four core, cognitively motivated and linguistically grounded navigation concepts and an algor…
External link:
http://arxiv.org/abs/2412.13026
Author:
Feng, Yu, Htut, Phu Mon, Qi, Zheng, Xiao, Wei, Mager, Manuel, Pappas, Nikolaos, Halder, Kishaloy, Li, Yang, Benajiba, Yassine, Roth, Dan
Quantifying the uncertainty in the factual parametric knowledge of Large Language Models (LLMs), especially in a black-box setting, poses a significant challenge. Existing methods, which gauge a model's uncertainty through evaluating self-consistency…
External link:
http://arxiv.org/abs/2412.09572
Language model users often issue queries that lack specification, where the context under which a query was issued -- such as the user's identity, the query's intent, and the criteria for a response to be useful -- is not explicit. For instance, a go…
External link:
http://arxiv.org/abs/2411.07237
With the ubiquity of Large Language Models (LLMs), guardrails have become crucial to detect and defend against toxic content. However, with the increasing pervasiveness of LLMs in multilingual scenarios, their effectiveness in handling multilingual t…
External link:
http://arxiv.org/abs/2410.22153
Existing math datasets evaluate the reasoning abilities of large language models (LLMs) by either using the final answer or the intermediate reasoning steps derived from static examples. However, the former approach fails to surface the model's uses of s…
External link:
http://arxiv.org/abs/2410.19056
Author:
Liu, Siyi, Ning, Qiang, Halder, Kishaloy, Xiao, Wei, Qi, Zheng, Htut, Phu Mon, Zhang, Yi, John, Neha Anna, Min, Bonan, Benajiba, Yassine, Roth, Dan
Open domain question answering systems frequently rely on information retrieved from large collections of text (such as the Web) to answer questions. However, such collections of text often contain conflicting information, and indiscriminately depend…
External link:
http://arxiv.org/abs/2410.12311
Existing retrieval-based reasoning approaches for large language models (LLMs) heavily rely on the density and quality of the non-parametric knowledge source to provide domain knowledge and explicit reasoning chains. However, inclusive knowledge sourc…
External link:
http://arxiv.org/abs/2410.08475
Author:
Elangovan, Aparna, Ko, Jongwoo, Xu, Lei, Elyasi, Mahsa, Liu, Ling, Bodapati, Sravan, Roth, Dan
The effectiveness of automatic evaluation of generative models is typically measured by comparing it to human evaluation using correlation metrics. However, metrics like Krippendorff's $\alpha$ and Randolph's $\kappa$, originally designed to measure…
External link:
http://arxiv.org/abs/2410.03775
Author:
Ou, Tianyue, Xu, Frank F., Madaan, Aman, Liu, Jiarui, Lo, Robert, Sridhar, Abishek, Sengupta, Sudipta, Roth, Dan, Neubig, Graham, Zhou, Shuyan
LLMs can now act as autonomous agents that interact with digital environments and complete specific objectives (e.g., arranging an online meeting). However, accuracy is still far from satisfactory, partly due to a lack of large-scale, direct demonstr…
External link:
http://arxiv.org/abs/2409.15637
Author:
Zhang, Qingru, Yu, Xiaodong, Singh, Chandan, Liu, Xiaodong, Liu, Liyuan, Gao, Jianfeng, Zhao, Tuo, Roth, Dan, Cheng, Hao
Large language models (LLMs) have demonstrated remarkable performance across various real-world tasks. However, they often struggle to fully comprehend and effectively utilize their input contexts, resulting in responses that are unfaithful or halluc…
External link:
http://arxiv.org/abs/2409.10790