Showing 1 - 10 of 1,944 for search: '"WINNICKI A."'
Standard reinforcement learning (RL) assumes that an agent can observe a reward for each state-action pair. However, in practical applications, it is often difficult and costly to collect a reward for each state-action pair. While there have been sev
External link:
http://arxiv.org/abs/2502.01876
We demonstrate production of cold atomic strontium (Sr) and strontium-containing molecules (SrOH) in a cryogenic buffer gas beam source via direct heating of strontium oxide (SrO) with 30 mJ laser pulses several milliseconds long. $3.7(2)\times10^{14
External link:
http://arxiv.org/abs/2407.09907
Reinforcement Learning from Human Feedback (RLHF) has achieved impressive empirical successes while relying on a small amount of human feedback. However, there is limited theoretical justification for this phenomenon. Additionally, most recent studie
External link:
http://arxiv.org/abs/2402.10342
Author:
Szczecina M., Winnicki A.
Published in:
Archives of Civil Engineering, Vol 62, Iss 1, Pp 51-64 (2016)
The paper presents some important aspects concerning material constants of concrete and stages of modeling of reinforced concrete structures. The problems taken into account are: a choice of proper material model for concrete, establishing of compres
External link:
https://doaj.org/article/205b8e831ada49f0a632c0d4bb4e2f53
Author:
Winnicki, Anna; Srikant, R.
Optimal policies in standard MDPs can be obtained using either value iteration or policy iteration. However, in the case of zero-sum Markov games, there is no efficient policy iteration algorithm; e.g., it has been shown that one has to solve Omega(1
External link:
http://arxiv.org/abs/2303.09716
Published in:
Reports on Geodesy and Geoinformatics, Vol 118, Iss 1 (2024)
The research presented in this paper concerns the determination of the attraction basins of Newton’s iterative method, which was used to solve the non-linear systems of observational equations associated with the geodetic measurements. The simple o
External link:
https://doaj.org/article/c2326c57524d4ca8a24c842baaaf2912
Author:
Winnicki, Anna; Srikant, R.
A common technique in reinforcement learning is to evaluate the value function from Monte Carlo simulations of a given policy, and use the estimated value function to obtain a new policy which is greedy with respect to the estimated value function. A
External link:
http://arxiv.org/abs/2301.09709
Author:
Anna Niezgoda, Andrzej Winnicki, Jerzy Krysiński, Piotr Niezgoda, Laura Nowowiejska, Rafał Czajkowski
Published in:
Scientific Reports, Vol 14, Iss 1, Pp 1-14 (2024)
Contemporary treatment of vitiligo remains a great challenge to practitioners. The vast majority of currently conducted clinical trials of modern therapeutic methods are focused on systemic medications, while there is only a very limited num
External link:
https://doaj.org/article/4a9e1d3ab5584b389ba83ad9e43cadfd
Published in:
Knowledge and Management of Aquatic Ecosystems, Vol 0, Iss 376-377, Pp 787-793 (2005)
The potential of artificial hideouts, outfitted with magnets or their imitations (control), to attract spinycheek crayfish Orconectes limosus was studied. The experiments were carried out from 1999 to 2002 in an 80-hectare natural lake. There were thre
External link:
https://doaj.org/article/07850dce5b5244ba8b33935546d5921d
Author:
Winnicki, Anna; Srikant, R.
We provide performance guarantees for a variant of simulation-based policy iteration for controlling Markov decision processes that involves the use of stochastic approximation algorithms along with state-of-the-art techniques that are useful for ver
External link:
http://arxiv.org/abs/2210.07338