Výsledky vyhledávání - "Zhang, Beichen"

Report

Downstream-Pretext Domain Knowledge Traceback for Active Learning

Autor: Zhang, Beichen, Li, Liang, Zha, Zheng-Jun, Luo, Jiebo, Huang, Qingming

Active learning (AL) is designed to construct a high-quality labeled dataset by iteratively selecting the most informative samples. Such sampling heavily relies on data representation, while recently pre-training is popular for robust feature learnin

Externí odkaz: http://arxiv.org/abs/2407.14720

Zobrazit plný text záznamu

Report

YuLan: An Open-source Large Language Model

Large language models (LLMs) have become the foundation of many applications, leveraging their extensive capabilities in processing and understanding natural language. While many open-source LLMs have been released with technical reports, the lack of

Externí odkaz: http://arxiv.org/abs/2406.19853

Zobrazit plný text záznamu

Report

Reinforcement Learning for Sociohydrology

Autor: Roy, Tirthankar, Srivastava, Shivendra, Zhang, Beichen

In this study, we discuss how reinforcement learning (RL) provides an effective and efficient framework for solving sociohydrology problems. The efficacy of RL for these types of problems is evident because of its ability to update policies in an ite

Externí odkaz: http://arxiv.org/abs/2405.20772

Zobrazit plný text záznamu

Report

JiuZhang3.0: Efficiently Improving Mathematical Reasoning by Training Small Data Synthesis Models

Autor: Zhou, Kun, Zhang, Beichen, Wang, Jiapeng, Chen, Zhipeng, Zhao, Wayne Xin, Sha, Jing, Sheng, Zhichao, Wang, Shijin, Wen, Ji-Rong

Mathematical reasoning is an important capability of large language models~(LLMs) for real-world applications. To enhance this capability, existing work either collects large-scale math-related texts for pre-training, or relies on stronger LLMs (\eg

Externí odkaz: http://arxiv.org/abs/2405.14365

Zobrazit plný text záznamu

Report

Long-CLIP: Unlocking the Long-Text Capability of CLIP

Autor: Zhang, Beichen, Zhang, Pan, Dong, Xiaoyi, Zang, Yuhang, Wang, Jiaqi

Contrastive Language-Image Pre-training (CLIP) has been the cornerstone for zero-shot classification, text-image retrieval, and text-image generation by aligning image and text modalities. Despite its widespread adoption, a significant limitation of

Externí odkaz: http://arxiv.org/abs/2403.15378

Zobrazit plný text záznamu

Report

Boosting Order-Preserving and Transferability for Neural Architecture Search: a Joint Architecture Refined Search and Fine-tuning Approach

Autor: Zhang, Beichen, Wang, Xiaoxing, Qin, Xiaohan, Yan, Junchi

Supernet is a core component in many recent Neural Architecture Search (NAS) methods. It not only helps embody the search space but also provides a (relative) estimation of the final performance of candidate architectures. Thus, it is critical that t

Externí odkaz: http://arxiv.org/abs/2403.11380

Zobrazit plný text záznamu

Report

JiuZhang 2.0: A Unified Chinese Pre-trained Language Model for Multi-task Mathematical Problem Solving

Autor: Zhao, Wayne Xin, Zhou, Kun, Zhang, Beichen, Gong, Zheng, Chen, Zhipeng, Zhou, Yuanhang, Wen, Ji-Rong, Sha, Jing, Wang, Shijin, Liu, Cong, Hu, Guoping

Although pre-trained language models~(PLMs) have recently advanced the research progress in mathematical reasoning, they are not specially designed as a capable multi-task solver, suffering from high cost for multi-task deployment (\eg a model copy f

Externí odkaz: http://arxiv.org/abs/2306.11027

Zobrazit plný text záznamu

Report

Evaluating and Improving Tool-Augmented Computation-Intensive Math Reasoning

Autor: Zhang, Beichen, Zhou, Kun, Wei, Xilin, Zhao, Wayne Xin, Sha, Jing, Wang, Shijin, Wen, Ji-Rong

Chain-of-thought prompting~(CoT) and tool augmentation have been validated in recent work as effective practices for improving large language models~(LLMs) to perform step-by-step reasoning on complex math-related tasks. However, most existing math r

Externí odkaz: http://arxiv.org/abs/2306.02408

Zobrazit plný text záznamu

Report

ChatCoT: Tool-Augmented Chain-of-Thought Reasoning on Chat-based Large Language Models

Autor: Chen, Zhipeng, Zhou, Kun, Zhang, Beichen, Gong, Zheng, Zhao, Wayne Xin, Wen, Ji-Rong

Although large language models (LLMs) have achieved excellent performance in a variety of evaluation benchmarks, they still struggle in complex reasoning tasks which require specific knowledge and multi-hop reasoning. To improve the reasoning abiliti

Externí odkaz: http://arxiv.org/abs/2305.14323

Zobrazit plný text záznamu

Report

A Survey of Large Language Models

Language is essentially a complex, intricate system of human expressions governed by grammatical rules. It poses a significant challenge to develop capable AI algorithms for comprehending and grasping a language. As a major approach, language modelin

Externí odkaz: http://arxiv.org/abs/2303.18223

Zobrazit plný text záznamu

Vyhledávací nástroje:

Upřesnit hledání