Search results

Report

Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search

Authors: Yao, Huanjin; Huang, Jiaxing; Wu, Wenhao; Zhang, Jingyi; Wang, Yibo; Liu, Shunyu; Wang, Yingjie; Song, Yuxin; Feng, Haocheng; Shen, Li; Tao, Dacheng

In this work, we aim to develop an MLLM that understands and solves questions by learning to create each intermediate step of the reasoning involved till the final answer. To this end, we propose Collective Monte Carlo Tree Search (CoMCTS), a new lea

External link: http://arxiv.org/abs/2412.18319

Report

Multi-Agent Sampling: Scaling Inference Compute for Data Synthesis with Tree Search-Based Agentic Collaboration

Authors: Ye, Hai; Lin, Mingbao; Ng, Hwee Tou; Yan, Shuicheng

Scaling laws for inference compute in multi-agent systems remain under-explored compared to single-agent scenarios. This work aims to bridge this gap by investigating the problem of data synthesis through multi-agent sampling, where synthetic respons

External link: http://arxiv.org/abs/2412.17061

Report

Ensembling Large Language Models with Process Reward-Guided Tree Search for Better Complex Reasoning

Authors: Park, Sungjin; Liu, Xiao; Gong, Yeyun; Choi, Edward

Despite recent advances in large language models, open-source models often struggle to consistently perform well on complex reasoning tasks. Existing ensemble methods, whether applied at the token or output levels, fail to address these challenges. I

External link: http://arxiv.org/abs/2412.15797

Report

Think&Cite: Improving Attributed Text Generation with Self-Guided Tree Search and Progress Reward Modeling

Authors: Li, Junyi; Ng, Hwee Tou

Despite their outstanding capabilities, large language models (LLMs) are prone to hallucination and producing factually incorrect information. This challenge has spurred efforts in attributed text generation, which prompts LLMs to generate content wi

External link: http://arxiv.org/abs/2412.14860

Report

Seed-CTS: Unleashing the Power of Tree Search for Superior Performance in Competitive Coding Tasks

Authors: Wang, Hao; Liu, Boyi; Zhang, Yufeng; Chen, Jie

Competition-level code generation tasks pose significant challenges for current state-of-the-art large language models (LLMs). For example, on the LiveCodeBench-Hard dataset, models such as O1-Mini and O1-Preview achieve pass@1 rates of only 0.366 an

External link: http://arxiv.org/abs/2412.12544

Report

SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models

Authors: Cheng, Jiale; Liu, Xiao; Wang, Cunxiang; Gu, Xiaotao; Lu, Yida; Zhang, Dan; Dong, Yuxiao; Tang, Jie; Wang, Hongning; Huang, Minlie

Instruction-following is a fundamental capability of language models, requiring the model to recognize even the most subtle requirements in the instructions and accurately reflect them in its output. Such an ability is well-suited for and often optim

External link: http://arxiv.org/abs/2412.11605

Report

Monte Carlo Tree Search with Spectral Expansion for Planning with Dynamical Systems

Authors: Riviere, Benjamin; Lathrop, John; Chung, Soon-Jo

Published in: Science Robotics, 4 Dec 2024, Vol 9, Issue 97

The ability of a robot to plan complex behaviors with real-time computation, rather than adhering to predesigned or offline-learned routines, alleviates the need for specialized algorithms or training for each problem instance. Monte Carlo Tree Searc

External link: http://arxiv.org/abs/2412.11270

Report

Enhancing LLM Reasoning with Reward-guided Tree Search

Authors: Jiang, Jinhao; Chen, Zhipeng; Min, Yingqian; Chen, Jie; Cheng, Xiaoxue; Wang, Jiapeng; Tang, Yiru; Sun, Haoxiang; Deng, Jia; Zhao, Wayne Xin; Liu, Zheng; Yan, Dong; Xie, Jian; Wang, Zhongyuan; Wen, Ji-Rong

Recently, test-time scaling has garnered significant attention from the research community, largely due to the substantial advancements of the o1 model released by OpenAI. By allocating more computational resources during the inference phase, large l

External link: http://arxiv.org/abs/2411.11694

Report

Enhancing Reasoning through Process Supervision with Monte Carlo Tree Search

Authors: Li, Shuangtao; Dong, Shuaihao; Luan, Kexin; Di, Xinhan; Ding, Chaofan

Large language models (LLMs) have demonstrated their remarkable capacity across a variety of tasks. However, reasoning remains a challenge for LLMs. To improve LLMs' reasoning ability, process supervision has proven to be better than outcome supervis

External link: http://arxiv.org/abs/2501.01478

Report

HUNYUANPROVER: A Scalable Data Synthesis Framework and Guided Tree Search for Automated Theorem Proving

Authors: Li, Yang; Du, Dong; Song, Linfeng; Li, Chen; Wang, Weikang; Yang, Tao; Mi, Haitao

We introduce HunyuanProver, a language model finetuned from the Hunyuan 7B for interactive automatic theorem proving with LEAN4. To alleviate the data sparsity issue, we design a scalable framework to iteratively synthesize data with low cost. Besides

External link: http://arxiv.org/abs/2412.20735
