Showing 1 - 10 of 637 for search: '"Pham, Huyen"'
We propose a comprehensive framework for policy gradient methods tailored to continuous time reinforcement learning. This is based on the connection between stochastic control problems and randomised problems, enabling applications across various cla…
External link:
http://arxiv.org/abs/2404.17939
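The snippet above connects stochastic control with randomised (stochastic) policies. A minimal sketch of the underlying score-function policy-gradient idea, on a time-discretised linear-quadratic toy problem — the dynamics, cost, Gaussian policy, and all parameters here are illustrative assumptions, not the paper's setting:

```python
import numpy as np

rng = np.random.default_rng(0)
dt, n_steps, n_paths = 0.1, 20, 4000
sig_a = 0.5  # exploration noise of the randomised (Gaussian) policy


def grad_estimate(theta):
    """Score-function estimate of d/dtheta E[cost] for the policy a = theta*x + noise."""
    x = np.ones(n_paths)
    score = np.zeros(n_paths)  # accumulated d/dtheta of the action log-likelihood
    cost = np.zeros(n_paths)
    for _ in range(n_steps):
        a = theta * x + sig_a * rng.normal(size=n_paths)
        score += (a - theta * x) * x / sig_a**2
        cost += (x**2 + a**2) * dt               # running quadratic cost
        x = x + a * dt + 0.1 * np.sqrt(dt) * rng.normal(size=n_paths)
    baseline = cost.mean()                        # baseline for variance reduction
    return np.mean((cost - baseline) * score)


theta = 0.0
for _ in range(40):
    theta -= 0.05 * grad_estimate(theta)          # gradient descent on expected cost
```

After a few dozen iterations `theta` turns clearly negative, i.e. the learned feedback pushes the state toward zero, as expected for a quadratic cost in state and action.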
Author:
Jia, Wenxuan, Xu, Victoria, Kuns, Kevin, Nakano, Masayuki, Barsotti, Lisa, Evans, Matthew, Mavalvala, Nergis, Abbott, Rich, Abouelfettouh, Ibrahim, Adhikari, Rana, Ananyeva, Alena, Appert, Stephen, Arai, Koji, Aritomi, Naoki, Aston, Stuart, Ball, Matthew, Ballmer, Stefan, Barker, David, Berger, Beverly, Betzwieser, Joseph, Bhattacharjee, Dripta, Billingsley, Garilynn, Bode, Nina, Bonilla, Edgard, Bossilkov, Vladimir, Branch, Adam, Brooks, Aidan, Brown, Daniel, Bryant, John, Cahillane, Craig, Cao, Huy-tuong, Capote, Elenna, Chen, Yanbei, Clara, Filiberto, Collins, Josh, Compton, Camilla, Cottingham, Robert, Coyne, Dennis, Crouch, Ryan, Csizmazia, Janos, Cullen, Torrey, Dartez, Louis, Demos, Nicholas, Dohmen, Ezekiel, Driggers, Jenne, Dwyer, Sheila, Effler, Anamaria, Ejlli, Aldo, Etzel, Todd, Feicht, Jon, Frey, Raymond, Frischhertz, William, Fritschel, Peter, Frolov, Valery, Fulda, Paul, Fyffe, Michael, Ganapathy, Dhruva, Gateley, Bubba, Giaime, Joe, Giardina, Dwayne, Glanzer, Jane, Goetz, Evan, Jones, Aaron, Gras, Slawomir, Gray, Corey, Griffith, Don, Grote, Hartmut, Guidry, Tyler, Hall, Evan, Hanks, Jonathan, Hanson, Joe, Heintze, Matthew, Helmling-cornell, Adrian, Huang, Hsiang-yu, Inoue, Yuki, James, Alasdair, Jennings, Austin, Karat, Srinath, Kasprzack, Marie, Kawabe, Keita, Kijbunchoo, Nutsinee, Kissel, Jeffrey, Kontos, Antonios, Kumar, Rahul, Landry, Michael, Lantz, Brian, Laxen, Michael, Lee, Kyung-ha, Lesovsky, Madeline, Llamas, Francisco, Lormand, Marc, Loughlin, Hudsonalexander, Macas, Ronaldas, Macinnis, Myron, Makarem, Camille, Mannix, Benjaminrobert, Mansell, Georgia, Martin, Rodica, Maxwell, Nyath, Mccarrol, Garrett, Mccarthy, Richard, Mcclelland, David, Mccormick, Scott, Mcculler, Lee, Mcrae, Terry, Mera, Fernando, Merilh, Edmond, Meylahn, Fabian, Mittleman, Richard, Moraru, Dan, Moreno, Gerardo, Mould, Matthew, Mullavey, Adam, Nelson, Timothy, Neunzert, Ansel, Oberling, Jason, Ohanlon, Timothy, Osthelder, Charles, Ottaway, David, Overmier, Harry, 
Parker, William, Pele, Arnaud, Pham, Huyen, Pirello, Marc, Quetschke, Volker, Ramirez, Karla, Reyes, Jonathan, Richardson, Jonathan, Robinson, Mitchell, Rollins, Jameson, Romie, Janeen, Ross, Michael, Sadecki, Travis, Sanchez, Anthony, Sanchez, Eduardo, Sanchez, Luis, Savage, Richard, Schaetzl, Dean, Schiworski, Mitchell, Schnabel, Roman, Schofield, Robert, Schwartz, Eyal, Sellers, Danny, Shaffer, Thomas, Short, Ryan, Sigg, Daniel, Slagmolen, Bram, Soni, Siddharth, Sun, Ling, Tanner, David, Thomas, Michael, Thomas, Patrick, Thorne, Keith, Torrie, Calum, Traylor, Gary, Vajente, Gabriele, Vanosky, Jordan, Vecchio, Alberto, Veitch, Peter, Vibhute, Ajay, Vonreis, Erik, Warner, Jim, Weaver, Betsy, Weiss, Rainer, Whittle, Chris, Willke, Benno, Wipf, Christopher, Yamamoto, Hiro, Yu, Haocun, Zhang, Liyuan, Zucker, Michael
Precision measurements of space and time, like those made by the detectors of the Laser Interferometer Gravitational-wave Observatory (LIGO), are often confronted with fundamental limitations imposed by quantum mechanics. The Heisenberg uncertainty p…
External link:
http://arxiv.org/abs/2404.14569
We address a system of weakly interacting particles where the heterogeneous connections among the particles are described by a graph sequence and the number of particles grows to infinity. Our results extend the existing law of large numbers and propa…
External link:
http://arxiv.org/abs/2402.08628
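As an illustrative simulation of this setting (all parameters are placeholders, and a sampled Erdős–Rényi graph stands in for the general graph sequences of the paper), N particles attract each other along the edges of an interaction graph:

```python
import numpy as np

# Toy model: N weakly interacting particles whose attractive couplings are
# the edges of a sampled Erdos-Renyi graph (an illustrative stand-in for a
# general graph sequence).
rng = np.random.default_rng(1)
N, dt, steps, sigma = 400, 0.01, 300, 0.1

W = (rng.random((N, N)) < 0.5).astype(float)   # interaction graph, edge prob 1/2
np.fill_diagonal(W, 0.0)

x = rng.normal(size=N)                          # initial particle states, spread ~1
for _ in range(steps):
    # drift_i = (1/N) * sum_j W_ij * (x_j - x_i): attraction along graph edges
    drift = (W @ x) / N - (W.sum(axis=1) / N) * x
    x = x + drift * dt + sigma * np.sqrt(dt) * rng.normal(size=N)
```

The graph-weighted attraction contracts the particle spread well below its initial value while the empirical mean stays close to zero — the law-of-large-numbers / propagation-of-chaos behaviour the abstract refers to, in miniature.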
Author:
Pham, Huyên, Warin, Xavier
We develop a new policy gradient and actor-critic algorithm for solving mean-field control problems within a continuous time reinforcement learning setting. Our approach leverages a gradient-based representation of the value function, employing param…
External link:
http://arxiv.org/abs/2309.04317
We study binary opinion formation in a large population where individuals are influenced by the opinions of other individuals. The population is characterised by the existence of (i) communities where individuals share some similar features, (ii) opi…
External link:
http://arxiv.org/abs/2306.16553
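A toy sketch of binary opinions in a two-community population — the community sizes, influence weights, and Glauber-type flip rule below are illustrative assumptions, not the paper's model:

```python
import numpy as np

# Two communities of binary opinions in {-1, +1}; intra-community influence
# (w_in) is stronger than inter-community influence (w_out).
rng = np.random.default_rng(3)
N = 200                                  # agents per community
w_in, w_out, beta = 1.0, 0.2, 2.0        # influence weights, inverse temperature

op = [rng.choice([-1, 1], size=N, p=[0.3, 0.7]),   # community 0 leans toward +1
      rng.choice([-1, 1], size=N, p=[0.7, 0.3])]   # community 1 leans toward -1

for _ in range(200):
    m = [o.mean() for o in op]           # current mean opinion of each community
    for c in (0, 1):
        field = w_in * m[c] + w_out * m[1 - c]
        p_up = 1.0 / (1.0 + np.exp(-2.0 * beta * field))   # Glauber-type flip prob
        op[c] = np.where(rng.random(N) < p_up, 1, -1)
```

With these weights each community polarises toward its initial majority; the weak cross-community coupling is not enough to force global consensus.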
We propose a novel generative model for time series based on the Schrödinger bridge (SB) approach. This consists of the entropic interpolation via optimal transport between a reference probability measure on path space and a target measure consistent…
External link:
http://arxiv.org/abs/2304.05093
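A static, discrete sketch of the entropic optimal-transport problem that underlies Schrödinger-bridge interpolation — a plain Sinkhorn solver on toy marginals, not the paper's path-space construction or its time-series sampler:

```python
import numpy as np


def sinkhorn(a, b, C, eps=0.1, iters=500):
    """Entropic-OT coupling between marginals a and b for cost matrix C."""
    K = np.exp(-C / eps)                 # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(iters):
        v = b / (K.T @ u)                # alternate scaling to match marginal b
        u = a / (K @ v)                  # ... and marginal a
    return u[:, None] * K * v[None, :]   # coupling matrix


xs = np.linspace(0.0, 1.0, 8)
a = np.full(8, 1 / 8)                                 # reference marginal (uniform)
b = np.exp(-(xs - 0.7) ** 2 / 0.02)
b = b / b.sum()                                       # target marginal
P = sinkhorn(a, b, (xs[:, None] - xs[None, :]) ** 2)  # squared-distance cost
```

The returned coupling `P` has (up to the iteration tolerance) marginals `a` and `b`, and among all such couplings it is the one closest in relative entropy to the Gibbs kernel — the static analogue of the entropic interpolation mentioned in the abstract.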
We study policy gradient for mean-field control in continuous time in a reinforcement learning setting. By considering randomised policies with entropy regularisation, we derive a gradient expectation representation of the value function, which is am…
External link:
http://arxiv.org/abs/2303.06993
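The entropy-regularisation idea can be illustrated in the simplest possible setting: a softmax policy over finitely many actions, where the regularised optimum is known in closed form to be the Gibbs policy. The rewards, temperature, and exact-gradient ascent below are toy assumptions; the paper works in continuous time with mean-field dynamics.

```python
import numpy as np

r = np.array([1.0, 2.0, 0.5])   # toy action rewards
tau = 0.5                       # entropy-regularisation temperature
theta = np.zeros(3)             # softmax policy parameters


def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()


# Gradient ascent on J(theta) = sum_i pi_i * r_i + tau * H(pi), pi = softmax(theta);
# the exact gradient is pi_k * (adv_k - sum_i pi_i * adv_i) with
# adv = r - tau * log(pi).
for _ in range(5000):
    pi = softmax(theta)
    adv = r - tau * np.log(pi)
    theta += 0.5 * pi * (adv - pi @ adv)

# The regularised optimum is the Gibbs policy pi* proportional to exp(r / tau).
target = np.exp(r / tau)
target = target / target.sum()
```

At convergence the fitted policy matches the Gibbs policy `target`, which is exactly the fixed point where the regularised advantage `r - tau*log(pi)` is constant across actions.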
We develop policy gradient methods for stochastic control with exit time in a model-free setting. We propose two types of algorithms: one learns the optimal policy directly, the other alternately learns the value function (critic) and the optim…
External link:
http://arxiv.org/abs/2302.07320
Author:
Pham, Huyên, Warin, Xavier
This paper is devoted to the numerical resolution of McKean-Vlasov control problems via the class of mean-field neural networks introduced in our companion paper [25] in order to learn the solution on the Wasserstein space. We propose several algorit…
External link:
http://arxiv.org/abs/2212.11518
Author:
Pham, Huyên, Warin, Xavier
We study the machine learning task for models with operators mapping between the Wasserstein space of probability measures and a space of functions, as arises e.g. in mean-field games/control problems. Two classes of neural networks, based on bin density…
External link:
http://arxiv.org/abs/2210.15179
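The bin-density idea from the snippet above can be sketched in a few lines: represent a probability measure by its histogram on a fixed grid, so that a functional on the Wasserstein space becomes an ordinary function of the bin weights. The toy linear model, grid, and mean functional below are illustrative assumptions (the paper uses neural networks):

```python
import numpy as np

rng = np.random.default_rng(2)
edges = np.linspace(-3.0, 3.0, 31)       # fixed grid: 30 bins on a truncated support


def bin_density(samples):
    """Empirical bin probabilities: the finite-dimensional proxy for the measure."""
    h, _ = np.histogram(samples, bins=edges)
    return h / len(samples)


# Learn the mean functional mu -> E_mu[X] from (bin-density, sample-mean) pairs
# drawn from Gaussian measures with random location.
X, y = [], []
for _ in range(200):
    m = rng.uniform(-1.0, 1.0)
    s = rng.normal(m, 0.5, size=2000)
    X.append(bin_density(s))
    y.append(s.mean())
w, *_ = np.linalg.lstsq(np.array(X), np.array(y), rcond=None)

# Evaluate the learned functional on a fresh, unseen measure.
test_samples = rng.normal(0.3, 0.5, size=2000)
pred = bin_density(test_samples) @ w     # predicted mean of the new measure
```

Because the mean functional is linear in the bin weights (approximately, up to discretisation), even least squares recovers it here; the point is only that the measure-to-scalar map has become a plain vector-to-scalar regression problem.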