Showing 1 - 10 of 67 for search: '"Wang, Shenzhi"'
Author:
Yue, Yang, Wang, Yulin, Kang, Bingyi, Han, Yizeng, Wang, Shenzhi, Song, Shiji, Feng, Jiashi, Huang, Gao
Multimodal large language models (MLLMs) have demonstrated remarkable comprehension and reasoning capabilities with complex language and visual data. These advances have spurred the vision of establishing a generalist robotic MLLM proficient in understanding complex human instructions…
External link:
http://arxiv.org/abs/2411.02359
Author:
Lin, Matthieu, Sheng, Jenny, Zhao, Andrew, Wang, Shenzhi, Yue, Yang, Wu, Yiran, Liu, Huan, Liu, Jun, Huang, Gao, Liu, Yong-Jin
In a compound AI system, components such as an LLM call, a retriever, a code interpreter, or tools are interconnected. The system's behavior is primarily driven by parameters such as instructions or tool definitions. Recent advancements enable end-to-end…
External link:
http://arxiv.org/abs/2410.16392
Author:
Wang, Huanqian, Yue, Yang, Lu, Rui, Shi, Jingxin, Zhao, Andrew, Wang, Shenzhi, Song, Shiji, Huang, Gao
Large Language Models (LLMs) have demonstrated great potential as generalist assistants, showcasing powerful task understanding and problem-solving capabilities. To deploy LLMs as AI assistants, it is crucial that these models exhibit desirable behaviors…
External link:
http://arxiv.org/abs/2407.08770
Author:
Zhao, Andrew, Xu, Quentin, Lin, Matthieu, Wang, Shenzhi, Liu, Yong-jin, Zheng, Zilong, Huang, Gao
Recent advances in large language models (LLMs) have made them indispensable, raising significant concerns over managing their safety. Automated red teaming offers a promising alternative to the labor-intensive and error-prone manual probing for vulnerabilities…
External link:
http://arxiv.org/abs/2405.19026
Author:
Yang, Qisen, Wang, Zekun, Chen, Honghui, Wang, Shenzhi, Pu, Yifan, Gao, Xin, Huang, Wenhao, Song, Shiji, Huang, Gao
Psychological measurement is essential for mental health, self-understanding, and personal development. Traditional methods, such as self-report scales and psychologist interviews, often face challenges with engagement and accessibility. While game-based…
External link:
http://arxiv.org/abs/2402.12326
Author:
Wang, Shenzhi, Yang, Qisen, Gao, Jiawei, Lin, Matthieu Gaetan, Chen, Hao, Wu, Liwei, Jia, Ning, Song, Shiji, Huang, Gao
Offline-to-online reinforcement learning (RL) is a training paradigm that combines pre-training on a pre-collected dataset with fine-tuning in an online environment. However, the incorporation of online fine-tuning can intensify the well-known distributional shift…
External link:
http://arxiv.org/abs/2310.17966
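To make the two-phase paradigm summarized in the abstract above concrete, here is a minimal, purely illustrative sketch (not code from the paper): Q-values are first fitted to a fixed batch of logged transitions (offline pre-training) and then refined through fresh environment interaction (online fine-tuning). The toy chain environment, the tabular Q-learning update, and all hyperparameters are assumptions chosen for brevity.

```python
import random
from collections import defaultdict

N_STATES = 5          # toy chain MDP: states 0..4, reward for reaching the right end
ACTIONS = [0, 1]      # 0 = move left, 1 = move right
GAMMA, ALPHA = 0.9, 0.1

def step(state, action):
    """Toy dynamics; reward 1 when the rightmost state is reached."""
    nxt = min(state + 1, N_STATES - 1) if action == 1 else max(state - 1, 0)
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward

def td_update(q, s, a, r, s_next):
    """One Q-learning (temporal-difference) update."""
    target = r + GAMMA * max(q[(s_next, b)] for b in ACTIONS)
    q[(s, a)] += ALPHA * (target - q[(s, a)])

q = defaultdict(float)

# Phase 1: offline pre-training on a pre-collected dataset of logged transitions.
dataset = []
for _ in range(2000):
    s = random.randrange(N_STATES)
    a = random.choice(ACTIONS)
    s2, r = step(s, a)
    dataset.append((s, a, r, s2))
for s, a, r, s2 in dataset:
    td_update(q, s, a, r, s2)

# Phase 2: online fine-tuning, where new interaction corrects the offline estimates.
for _ in range(200):
    s = 0
    for _ in range(20):
        if random.random() < 0.1:                       # epsilon-greedy exploration
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda b: q[(s, b)])
        s2, r = step(s, a)
        td_update(q, s, a, r, s2)
        s = s2

# Greedy action per state after both phases (expected: always move right).
print({s: max(ACTIONS, key=lambda b: q[(s, b)]) for s in range(N_STATES)})
```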
Author:
Wang, Shenzhi, Liu, Chang, Zheng, Zilong, Qi, Siyuan, Chen, Shuo, Yang, Qisen, Zhao, Andrew, Wang, Chaofei, Song, Shiji, Huang, Gao
Recent breakthroughs in large language models (LLMs) have brought remarkable success in the field of LLM-as-Agent. Nevertheless, a prevalent assumption is that the information processed by LLMs is consistently honest, neglecting the pervasive decepti…
External link:
http://arxiv.org/abs/2310.01320
Offline reinforcement learning (RL) optimizes the policy on a previously collected dataset without any interactions with the environment, yet usually suffers from the distributional shift problem. To mitigate this issue, a typical solution is to impose…
External link:
http://arxiv.org/abs/2309.01448
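The "typical solution" mentioned in the abstract above is usually a constraint that keeps the learned policy close to the behavior policy that collected the data. The following is a hedged, illustrative sketch of that general idea (in the spirit of behavior-cloning-regularized methods such as TD3+BC), not the method proposed in the paper: the 1-D toy task, the hand-written critic, and the coefficient `bc_weight` are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Offline dataset: states and the actions taken by a suboptimal behavior policy.
states = rng.uniform(-1.0, 1.0, size=256)
behavior_actions = 0.5 * states + 0.05 * rng.normal(size=256)

def critic_q(s, a):
    """Hypothetical learned critic: highest value at a = s (the 'optimal' action)."""
    return -(a - s) ** 2

theta = 0.0                 # deterministic linear policy: pi(s) = theta * s
bc_weight, lr = 1.0, 0.05   # strength of the behavior-cloning constraint

for _ in range(500):
    a_pi = theta * states
    # Gradient of  -Q(s, pi(s)) + bc_weight * (pi(s) - a_data)^2  w.r.t. theta.
    dq_da = -2.0 * (a_pi - states)                     # dQ/da for the toy critic
    grad = np.mean((-dq_da + 2.0 * bc_weight * (a_pi - behavior_actions)) * states)
    theta -= lr * grad

# theta settles between the behavior policy (0.5) and the critic's optimum (1.0),
# showing how the constraint keeps the policy close to the data distribution.
print(round(theta, 3))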
Training practical agents usually involves offline and online reinforcement learning (RL) to balance the policy's performance and interaction costs. In particular, online fine-tuning has become a commonly used method to correct the erroneous estimates…
External link:
http://arxiv.org/abs/2306.03362
Published in:
Heliyon, 10(9), 15 May 2024