Search results

Report

EVOLvE: Evaluating and Optimizing LLMs For Exploration

Authors: Nie, Allen; Su, Yi; Chang, Bo; Lee, Jonathan N.; Chi, Ed H.; Le, Quoc V.; Chen, Minmin

Despite their success in many domains, large language models (LLMs) remain under-studied in scenarios requiring optimal decision-making under uncertainty. This is crucial as many real-world applications, ranging from personalized recommendations to h

External link: http://arxiv.org/abs/2410.06238

Report

Large Language Monkeys: Scaling Inference Compute with Repeated Sampling

Authors: Brown, Bradley; Juravsky, Jordan; Ehrlich, Ryan; Clark, Ronald; Le, Quoc V.; Ré, Christopher; Mirhoseini, Azalia

Scaling the amount of compute used to train language models has dramatically improved their capabilities. However, when it comes to inference, we often limit the amount of compute to only one attempt per problem. Here, we explore inference compute as

External link: http://arxiv.org/abs/2407.21787

Report

NATURAL PLAN: Benchmarking LLMs on Natural Language Planning

Authors: Zheng, Huaixiu Steven; Mishra, Swaroop; Zhang, Hugh; Chen, Xinyun; Chen, Minmin; Nova, Azade; Hou, Le; Cheng, Heng-Tze; Le, Quoc V.; Chi, Ed H.; Zhou, Denny

We introduce NATURAL PLAN, a realistic planning benchmark in natural language containing 3 key tasks: Trip Planning, Meeting Planning, and Calendar Scheduling. We focus our evaluation on the planning capabilities of LLMs with full information on the

External link: http://arxiv.org/abs/2406.04520

Report

Long-form factuality in large language models

Authors: Wei, Jerry; Yang, Chengrun; Song, Xinying; Lu, Yifeng; Hu, Nathan; Huang, Jie; Tran, Dustin; Peng, Daiyi; Liu, Ruibo; Huang, Da; Du, Cosmo; Le, Quoc V.

Large language models (LLMs) often generate content that contains factual errors when responding to fact-seeking prompts on open-ended topics. To benchmark a model's long-form factuality in open domains, we first use GPT-4 to generate LongFact, a pro

External link: http://arxiv.org/abs/2403.18802

Report

CLEAR: Cross-Transformers with Pre-trained Language Model is All you need for Person Attribute Recognition and Retrieval

Authors: Bui, Doanh C.; Le, Thinh V.; Ngo, Ba Hung; Choi, Tae Jong

Person attribute recognition and attribute-based retrieval are two core human-centric tasks. In the recognition task, the challenge is specifying attributes depending on a person's appearance, while the retrieval task involves searching for matching

External link: http://arxiv.org/abs/2403.06119

Report

Understanding Social Perception, Interactions, and Safety Aspects of Sidewalk Delivery Robots Using Sentiment Analysis

Authors: Du, Yuchen; Le, Tho V.

This article presents a comprehensive sentiment analysis (SA) of comments on YouTube videos related to Sidewalk Delivery Robots (SDRs). We manually annotated the collected YouTube comments with three sentiment labels: negative (0), positive (1), and

External link: http://arxiv.org/abs/2405.00688

Report

Self-Discover: Large Language Models Self-Compose Reasoning Structures

Authors: Zhou, Pei; Pujara, Jay; Ren, Xiang; Chen, Xinyun; Cheng, Heng-Tze; Le, Quoc V.; Chi, Ed H.; Zhou, Denny; Mishra, Swaroop; Zheng, Huaixiu Steven

We introduce SELF-DISCOVER, a general framework for LLMs to self-discover the task-intrinsic reasoning structures to tackle complex reasoning problems that are challenging for typical prompting methods. Core to the framework is a self-discovery proce

External link: http://arxiv.org/abs/2402.03620

Report

AutoNumerics-Zero: Automated Discovery of State-of-the-Art Mathematical Functions

Authors: Real, Esteban; Chen, Yao; Rossini, Mirko; de Souza, Connal; Garg, Manav; Verghese, Akhil; Firsching, Moritz; Le, Quoc V.; Cubuk, Ekin Dogus; Park, David H.

Computers calculate transcendental functions by approximating them through the composition of a few limited-precision instructions. For example, an exponential can be calculated with a Taylor series. These approximation methods were developed over th

External link: http://arxiv.org/abs/2312.08472

Report

Analytical model for large-scale design of sidewalk delivery robot systems

Authors: Yang, Hai; Du, Yuchen; Le, Tho V.; Chow, Joseph Y. J.

With the rise in demand for local deliveries and e-commerce, robotic deliveries are being considered as efficient and sustainable solutions. However, the deployment of such systems can be highly complex due to numerous factors involving stochastic de

External link: http://arxiv.org/abs/2310.17475

Report

Take a Step Back: Evoking Reasoning via Abstraction in Large Language Models

Authors: Zheng, Huaixiu Steven; Mishra, Swaroop; Chen, Xinyun; Cheng, Heng-Tze; Chi, Ed H.; Le, Quoc V.; Zhou, Denny

We present Step-Back Prompting, a simple prompting technique that enables LLMs to do abstractions to derive high-level concepts and first principles from instances containing specific details. Using the concepts and principles to guide reasoning, LLM

External link: http://arxiv.org/abs/2310.06117
