Showing 1 - 10 of 303 for search: '"Newman, Benjamin A."'
Author:
Newman, Benjamin, Lee, Yoonjoo, Naik, Aakanksha, Siangliulue, Pao, Fok, Raymond, Kim, Juho, Weld, Daniel S., Chang, Joseph Chee, Lo, Kyle
When conducting literature reviews, scientists often create literature review tables: tables whose rows are publications and whose columns constitute a schema, a set of aspects used to compare and contrast the papers. Can we automatically generate…
External link:
http://arxiv.org/abs/2410.22360
Author:
Zhao, Wenting, Goyal, Tanya, Chiu, Yu Ying, Jiang, Liwei, Newman, Benjamin, Ravichander, Abhilasha, Chandu, Khyathi, Bras, Ronan Le, Cardie, Claire, Deng, Yuntian, Choi, Yejin
While hallucinations of large language models (LLMs) prevail as a major challenge, existing evaluation benchmarks on factuality do not cover the diverse domains of knowledge that real-world users of LLMs seek information about. To bridge this gap…
External link:
http://arxiv.org/abs/2407.17468
Author:
Newman, Benjamin A., Gupta, Pranay, Kitani, Kris, Bisk, Yonatan, Admoni, Henny, Paxton, Chris
De gustibus non est disputandum ("there is no accounting for others' tastes") is a common Latin maxim describing how many solutions in life are determined by people's personal preferences. Many household tasks, in particular, can only be considered…
External link:
http://arxiv.org/abs/2407.08876
Agents that assist people need to have well-initialized policies that can adapt quickly to align with their partners' reward functions. Initializing policies to maximize performance with unknown partners can be achieved by bootstrapping nonlinear models…
External link:
http://arxiv.org/abs/2404.10733
Over the past decade, the intricacies of sports-related concussions among female athletes have become readily apparent. Traditional clinical methods for diagnosing concussions suffer limitations when applied to female athletes, often failing to capture…
External link:
http://arxiv.org/abs/2401.13045
Author:
West, Peter, Lu, Ximing, Dziri, Nouha, Brahman, Faeze, Li, Linjie, Hwang, Jena D., Jiang, Liwei, Fisher, Jillian, Ravichander, Abhilasha, Chandu, Khyathi, Newman, Benjamin, Koh, Pang Wei, Ettinger, Allyson, Choi, Yejin
The recent wave of generative AI has sparked unprecedented global attention, with both excitement and concern over potentially superhuman levels of artificial intelligence: models now take only seconds to produce outputs that would challenge or exceed…
External link:
http://arxiv.org/abs/2311.00059
Many real-world applications (e.g., note-taking, search) require extracting a sentence or paragraph from a document and showing that snippet to a human outside of the source document. Yet, users may find snippets difficult to understand as they lack…
External link:
http://arxiv.org/abs/2305.14772
Traditionally, writing assistance systems have focused on short or even single-word suggestions. Recently, large language models like GPT-3 have made it possible to generate significantly longer, natural-sounding suggestions, offering more advanced…
External link:
http://arxiv.org/abs/2302.13382
Author:
Liang, Percy, Bommasani, Rishi, Lee, Tony, Tsipras, Dimitris, Soylu, Dilara, Yasunaga, Michihiro, Zhang, Yian, Narayanan, Deepak, Wu, Yuhuai, Kumar, Ananya, Newman, Benjamin, Yuan, Binhang, Yan, Bobby, Zhang, Ce, Cosgrove, Christian, Manning, Christopher D., Ré, Christopher, Acosta-Navas, Diana, Hudson, Drew A., Zelikman, Eric, Durmus, Esin, Ladhak, Faisal, Rong, Frieda, Ren, Hongyu, Yao, Huaxiu, Wang, Jue, Santhanam, Keshav, Orr, Laurel, Zheng, Lucia, Yuksekgonul, Mert, Suzgun, Mirac, Kim, Nathan, Guha, Neel, Chatterji, Niladri, Khattab, Omar, Henderson, Peter, Huang, Qian, Chi, Ryan, Xie, Sang Michael, Santurkar, Shibani, Ganguli, Surya, Hashimoto, Tatsunori, Icard, Thomas, Zhang, Tianyi, Chaudhary, Vishrav, Wang, William, Li, Xuechen, Mai, Yifan, Zhang, Yuhui, Koreeda, Yuta
Published in:
Transactions on Machine Learning Research (TMLR), 2023
Language models (LMs) are becoming the foundation for almost all major language technologies, but their capabilities, limitations, and risks are not well understood. We present Holistic Evaluation of Language Models (HELM) to improve the transparency…
External link:
http://arxiv.org/abs/2211.09110
Recent work (e.g., LAMA; Petroni et al., 2019) has found that the quality of the factual information extracted from Large Language Models (LLMs) depends on the prompts used to query them. This inconsistency is problematic because different users will…
External link:
http://arxiv.org/abs/2110.07280