Showing 1 - 10 of 401 for search: '"ZHANG Youzhi"'
Author:
CHAI Jing, HAN Zhicheng, LEI Wulin, ZHANG Dingding, MA Chenyang, SUN Kai, WENG Mingyue, ZHANG Youzhi, DING Guoli, ZHENG Zhongyou, ZHANG Yin, HAN Gang
Published in:
Meitan kexue jishu, Vol 51, Iss 1, Pp 146-156 (2023)
With the deepening of coal mining and the upsizing of mining equipment, the floor heave of the mining roadway has become an important problem that restricts the efficient and safe mining of the working face. It is of great significance to reveal the
External link:
https://doaj.org/article/6536aa1dda6b455b87bfe101808fb466
Author:
Liu, Naming, Wang, Mingzhi, Wang, Xihuai, Zhang, Weinan, Yang, Yaodong, Zhang, Youzhi, An, Bo, Wen, Ying
The ex ante equilibrium for two-team zero-sum games, where agents within each team collaborate to compete against the opposing team, is known to be the best a team can do for coordination. Many existing works on ex ante equilibrium solutions are aimi
External link:
http://arxiv.org/abs/2410.01575
Similarity matrix serves as a fundamental tool at the core of numerous downstream machine-learning tasks. However, missing data is inevitable and often results in an inaccurate similarity matrix. To address this issue, Similarity Matrix Completion (S
External link:
http://arxiv.org/abs/2409.19550
Author:
Li, Shuxin, Yang, Chang, Zhang, Youzhi, Li, Pengdeng, Wang, Xinrun, Huang, Xiao, Chan, Hau, An, Bo
Nash equilibrium (NE) is a widely adopted solution concept in game theory due to its stability property. However, we observe that the NE strategy might not always yield the best results, especially against opponents who do not adhere to NE strategies
External link:
http://arxiv.org/abs/2408.05575
Published in:
Frontiers in Materials, Vol 8 (2021)
In order to study the strength characteristics and hydration mechanism of the cemented ultra-fine tailings backfill (CUTB), the uniaxial compressive strength (UCS) tests of CUTB and cemented classified tailings backfill (CCTB) with cement-tailing rat
External link:
https://doaj.org/article/b598b4700aa84fc99bfe51b7a1d43d78
Reinforcement Learning from Human Feedback (RLHF) has been commonly used to align the behaviors of Large Language Models (LLMs) with human preferences. Recently, a popular alternative is Direct Policy Optimization (DPO), which replaces an LLM-based r
External link:
http://arxiv.org/abs/2405.21040
Author:
Li, Pengdeng, Li, Shuxin, Wang, Xinrun, Cerny, Jakub, Zhang, Youzhi, McAleer, Stephen, Chan, Hau, An, Bo
Pursuit-evasion games (PEGs) model interactions between a team of pursuers and an evader in graph-based environments such as urban street networks. Recent advancements have demonstrated the effectiveness of the pre-training and fine-tuning paradigm i
External link:
http://arxiv.org/abs/2404.12626
Two-team zero-sum games are one of the most important paradigms in game theory. In this paper, we focus on finding an unexploitable equilibrium in large team games. An unexploitable equilibrium is a worst-case policy, where members in the opponent te
External link:
http://arxiv.org/abs/2403.00255
Offline reinforcement learning (offline RL) is an emerging field that has recently begun gaining attention across various application domains due to its ability to learn strategies from earlier collected datasets. Offline RL proved very successful, p
External link:
http://arxiv.org/abs/2207.05285
Published in:
Sensors and Actuators: A. Physical, 1 October 2024, 376