Showing 1 - 10 of 235 for search: '"Qiu, Tianyi"'
Frontier AI systems, including large language models (LLMs), hold increasing influence over the epistemology of human users. Such influence can reinforce prevailing societal values, potentially contributing to the lock-in of misguided moral beliefs…
External link:
http://arxiv.org/abs/2406.20087
Author:
Ji, Jiaming, Hong, Donghai, Zhang, Borong, Chen, Boyuan, Dai, Josef, Zheng, Boren, Qiu, Tianyi, Li, Boxun, Yang, Yaodong
In this work, we introduce the PKU-SafeRLHF dataset, designed to promote research on safety alignment in large language models (LLMs). As a sibling project to SafeRLHF and BeaverTails, we separate annotations of helpfulness and harmlessness for…
External link:
http://arxiv.org/abs/2406.15513
Author:
Ji, Jiaming, Wang, Kaile, Qiu, Tianyi, Chen, Boyuan, Zhou, Jiayi, Li, Changye, Lou, Hantao, Yang, Yaodong
Large language models (LLMs) may exhibit undesirable behaviors. Recent efforts have focused on aligning these models to prevent harmful generation. Despite these efforts, studies have shown that even a well-conducted alignment process can be easily…
External link:
http://arxiv.org/abs/2406.06144
Author:
Qiu, Tianyi, Zeng, Fanzhi, Ji, Jiaming, Yan, Dong, Wang, Kaile, Zhou, Jiayi, Han, Yang, Dai, Josef, Pan, Xuehai, Yang, Yaodong
Existing alignment methods share a common topology of information flow, where reward information is collected from humans, modeled with preference learning, and used to tune language models. However, this shared topology has not been systematically…
External link:
http://arxiv.org/abs/2402.10184
Author:
Ji, Jiaming, Chen, Boyuan, Lou, Hantao, Hong, Donghai, Zhang, Borong, Pan, Xuehai, Dai, Juntao, Qiu, Tianyi, Yang, Yaodong
With the rapid development of large language models (LLMs) and ever-evolving practical requirements, finding an efficient and effective alignment method has never been more critical. However, the tension between the complexity of current alignment…
External link:
http://arxiv.org/abs/2402.02416
Author:
Ji, Jiaming, Qiu, Tianyi, Chen, Boyuan, Zhang, Borong, Lou, Hantao, Wang, Kaile, Duan, Yawen, He, Zhonghao, Zhou, Jiayi, Zhang, Zhaowei, Zeng, Fanzhi, Ng, Kwan Yee, Dai, Juntao, Pan, Xuehai, O'Gara, Aidan, Lei, Yingshan, Xu, Hua, Tse, Brian, Fu, Jie, McAleer, Stephen, Yang, Yaodong, Wang, Yizhou, Zhu, Song-Chun, Guo, Yike, Gao, Wen
AI alignment aims to make AI systems behave in line with human intentions and values. As AI systems grow more capable, so do risks from misalignment. To provide a comprehensive and up-to-date overview of the alignment field, in this survey, we delve…
External link:
http://arxiv.org/abs/2310.19852
Author:
Qiu, Tianyi, Peñuelas, Josep, Chen, Yinglong, Sardans, Jordi, Yu, Jialuo, Xu, Zhiyuan, Cui, Qingliang, Liu, Ji, Cui, Yongxing, Zhao, Shuling, Chen, Jing, Wang, Yunqiang, Fang, Linchuan
Published in:
iMeta, June 2024, Vol. 3, Issue 3, pp. 1-19.
Author:
Bai, Xiaohan, Bol, Roland, Chen, Hansong, Cui, Qingliang, Qiu, Tianyi, Zhao, Shuling, Fang, Linchuan
Published in:
Journal of Hazardous Materials, 5 June 2024, Vol. 471.
Author:
Chen, Li, Fang, Linchuan, Yang, Xing, Luo, Xiaosan, Qiu, Tianyi, Zeng, Yi, Huang, Fengyu, Dong, Faqin, White, Jason C, Bolan, Nanthi, Rinklebe, Jörg
Published in:
Environment International, May 2024, Vol. 187.
Author:
Chen, Li, Chang, Nan, Qiu, Tianyi, Wang, Na, Cui, Qingliang, Zhao, Shuling, Huang, Fengyu, Chen, Hansong, Zeng, Yi, Dong, Faqin, Fang, Linchuan
Published in:
Environmental Pollution, 1 May 2024, Vol. 348.