Showing 1 - 10 of 3,680 for search: '"Suttle A"'
Author:
Singh, Utsav, Chakraborty, Souradip, Suttle, Wesley A., Sadler, Brian M., Sahu, Anit Kumar, Shah, Mubarak, Namboodiri, Vinay P., Bedi, Amrit Singh
This work introduces Hierarchical Preference Optimization (HPO), a novel approach to hierarchical reinforcement learning (HRL) that addresses non-stationarity and infeasible subgoal generation issues when solving complex robotic control tasks. HPO le
External link:
http://arxiv.org/abs/2411.00361
Author:
Patel, Bhrij, Chakraborty, Souradip, Suttle, Wesley A., Wang, Mengdi, Bedi, Amrit Singh, Manocha, Dinesh
Text-based AI system optimization typically involves a feedback loop scheme where a single LLM generates an evaluation in natural language of the current output to improve the next iteration's output. However, in this work, we empirically demonstrate
External link:
http://arxiv.org/abs/2410.03131
Author:
Singh, Utsav, Chakraborty, Souradip, Suttle, Wesley A., Sadler, Brian M., Namboodiri, Vinay P., Bedi, Amrit Singh
Learning control policies to perform complex robotics tasks from human preference data presents significant challenges. On the one hand, the complexity of such tasks typically requires learning policies to perform a variety of subtasks, then combinin
External link:
http://arxiv.org/abs/2406.10892
In this work, we introduce PIPER: Primitive-Informed Preference-based Hierarchical reinforcement learning via Hindsight Relabeling, a novel approach that leverages preference-based learning to learn a reward model, and subsequently uses this reward m
External link:
http://arxiv.org/abs/2404.13423
Author:
Valenzuela-Villaseca, V., Suttle, L. G., Suzuki-Vidal, F., Halliday, J. W. D., Russell, D. R., Merlini, S., Tubman, E. R., Hare, J. D., Chittenden, J. P., Koepke, M. E., Blackman, E. G., Lebedev, S. V.
We present a detailed characterization of the structure and evolution of differentially rotating plasmas driven on the MAGPIE pulsed-power generator (1.4 MA peak current, 240 ns rise-time). The experiments were designed to simulate physics relevant t
External link:
http://arxiv.org/abs/2403.20321
Author:
Patel, Bhrij, Suttle, Wesley A., Koppel, Alec, Aggarwal, Vaneet, Sadler, Brian M., Bedi, Amrit Singh, Manocha, Dinesh
In the context of average-reward reinforcement learning, the requirement for oracle knowledge of the mixing time, a measure of the duration a Markov chain under a fixed policy needs to achieve its stationary distribution, poses a significant challeng
External link:
http://arxiv.org/abs/2403.11925
Author:
Suttle, Wesley A., Sharma, Vipul K., Kosaraju, Krishna C., Sivaranjani, S., Liu, Ji, Gupta, Vijay, Sadler, Brian M.
We develop provably safe and convergent reinforcement learning (RL) algorithms for control of nonlinear dynamical systems, bridging the gap between the hard safety guarantees of control theory and the convergence guarantees of RL theory. Recent advan
External link:
http://arxiv.org/abs/2403.04007
Author:
Walton, Craig R., Rigley, Jessica K., Lipp, Alexander, Law, Robert, Suttle, Martin D., Schonbachler, Maria, Wyatt, Mark, Shorttle, Oliver
Published in:
Earth. Nat Astron (2024)
Earth's surface is deficient in available forms of many elements considered limiting for prebiotic chemistry. In contrast, many extraterrestrial rocky objects are rich in these same elements. Limiting prebiotic ingredients may, therefore, have been d
External link:
http://arxiv.org/abs/2402.12310
Deceptive path planning (DPP) is the problem of designing a path that hides its true goal from an outside observer. Existing methods for DPP rely on unrealistic assumptions, such as global state observability and perfect model knowledge, and are typi
External link:
http://arxiv.org/abs/2402.06552
Author:
Valenzuela-Villaseca, V., Suttle, L. G., Suzuki-Vidal, F., Halliday, J. W. D., Russell, D. R., Merlini, S., Tubman, E. R., Hare, J. D., Chittenden, J. P., Koepke, M. E., Blackman, E. G., Lebedev, S. V.
Recent pulsed-power experiments have demonstrated the formation of astrophysically-relevant, differentially rotating plasmas [1]. Key features of the plasma flows are the discovery of a quasi-Keplerian rotation curve, the launching of highly-collimat
External link:
http://arxiv.org/abs/2312.02346