Showing 1 - 10
of 28,972
for the search: '"Hong, ZHANG"'
Author:
Hwang, Jaedong, Cheung, Brian, Hong, Zhang-Wei, Boopathy, Akhilan, Agrawal, Pulkit, Fiete, Ila
Highly performant large-scale pre-trained models promise to also provide a valuable foundation for learning specialized tasks, by fine-tuning the model to the desired task. By starting from a good general-purpose model, the goal is to achieve both…
External link:
http://arxiv.org/abs/2410.21582
Reward shaping is a critical component in reinforcement learning (RL), particularly for complex tasks where sparse rewards can hinder learning. While shaping rewards have been introduced to provide additional guidance, selecting effective shaping functions…
External link:
http://arxiv.org/abs/2410.13837
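The reward-shaping snippet above describes adding guidance signals to a sparse environment reward. A common, policy-preserving instantiation of this idea is potential-based shaping, F(s, s') = γΦ(s') − Φ(s). The sketch below assumes a hypothetical distance-to-goal potential on integer states; it illustrates the general technique, not the method of the linked paper:

```python
GAMMA = 0.99  # discount factor (illustrative choice)

def potential(state, goal):
    # Hypothetical potential function: negative distance to the goal.
    # This is an assumption for illustration, not taken from the abstract.
    return -abs(goal - state)

def shaped_reward(env_reward, state, next_state, goal, gamma=GAMMA):
    # Potential-based shaping: F = gamma * Phi(s') - Phi(s).
    # Adding F densifies sparse rewards while leaving the optimal
    # policy unchanged.
    f = gamma * potential(next_state, goal) - potential(state, goal)
    return env_reward + f
```

For example, moving from state 0 toward goal 10 yields a positive shaping bonus even when the environment reward is zero, giving the agent a learning signal before the goal is reached.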
The ability to efficiently explore high-dimensional state spaces is essential for the practical success of deep Reinforcement Learning (RL). This paper introduces a new exploration technique called Random Latent Exploration (RLE), which combines…
External link:
http://arxiv.org/abs/2407.13755
Published in:
Reinforcement Learning Journal, vol. 4, 2024, pp. 1598-1618
Experience replay serves as a key component in the success of online reinforcement learning (RL). Prioritized experience replay (PER) reweights experiences by the temporal difference (TD) error, empirically enhancing performance. However, few works…
External link:
http://arxiv.org/abs/2407.03995
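The PER snippet above describes reweighting stored transitions by their TD error. A minimal sketch of that sampling rule follows, using the standard proportional-prioritization form p ∝ (|δ| + ε)^α; the class and method names are illustrative, and the alpha/epsilon values are conventional defaults rather than details from the paper:

```python
import random

class PrioritizedReplay:
    # Minimal sketch of prioritized experience replay: transitions are
    # sampled with probability proportional to (|TD error| + eps)^alpha.
    def __init__(self, alpha=0.6, eps=1e-6):
        self.alpha = alpha
        self.eps = eps
        self.items = []       # stored transitions
        self.priorities = []  # |TD error| + eps per transition

    def add(self, transition, td_error):
        self.items.append(transition)
        self.priorities.append(abs(td_error) + self.eps)

    def sample(self, k):
        # Proportional prioritization: higher-error transitions are
        # replayed more often.
        weights = [p ** self.alpha for p in self.priorities]
        return random.choices(self.items, weights=weights, k=k)
```

A full implementation would also apply importance-sampling corrections and update priorities after each replay; those pieces are omitted here for brevity.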
Generating varied scenarios through simulation is crucial for training and evaluating safety-critical systems, such as autonomous vehicles. Yet, the task of modeling the trajectories of other vehicles to simulate diverse and meaningful close interactions…
External link:
http://arxiv.org/abs/2406.04300
Author:
Hong, Zhang-Wei, Shenfeld, Idan, Wang, Tsun-Hsuan, Chuang, Yung-Sung, Pareja, Aldo, Glass, James, Srivastava, Akash, Agrawal, Pulkit
Large language models (LLMs) hold great potential for many natural language applications but risk generating incorrect or toxic content. To probe when an LLM generates unwanted content, the current paradigm is to recruit a "red team" of human…
External link:
http://arxiv.org/abs/2402.19464
Depth completion is a long-standing challenge in computer vision, where classification-based methods have made tremendous progress in recent years. However, most existing classification-based methods rely on pre-defined pixel-shared and discrete depth…
External link:
http://arxiv.org/abs/2402.13579
Deep reinforcement learning methods exhibit impressive performance on a range of tasks but still struggle on hard exploration tasks in large environments with sparse rewards. To address this, intrinsic rewards can be generated using forward model pre…
External link:
http://arxiv.org/abs/2310.17537
Author:
Hong, Zhang-Wei, Kumar, Aviral, Karnik, Sathwik, Bhandwaldar, Abhishek, Srivastava, Akash, Pajarinen, Joni, Laroche, Romain, Gupta, Abhishek, Agrawal, Pulkit
Published in:
NeurIPS 2023
Offline policy learning is aimed at learning decision-making policies using existing datasets of trajectories without collecting additional data. The primary motivation for using reinforcement learning (RL) instead of supervised learning techniques…
External link:
http://arxiv.org/abs/2310.04413
Published in:
Scientific Reports, Vol 14, Iss 1, Pp 1-16 (2024)
Abstract: In the era of the global knowledge economy, the rapid advancement of new-generation information technologies has positioned the digital industry at the forefront of socio-economic development. Digital industrial clusters, comprising key enterprises…
External link:
https://doaj.org/article/77155bff38ef47f58d14beccdfd3fdbe