Showing 1 - 10 of 264 for search: '"Mohamad, Kazem"'
We study the problem of system identification for stochastic continuous-time dynamics, based on a single finite-length state trajectory. We present a method for estimating the possibly unstable open-loop matrix by employing properly randomized contro
External link:
http://arxiv.org/abs/2409.11327
Contextual bandits constitute a classical framework for decision-making under uncertainty. In this setting, the goal is to learn the arms of highest reward subject to contextual information, while the unknown reward parameters of each arm need to be
External link:
http://arxiv.org/abs/2402.10289
Published in:
Scientific Reports, Vol 14, Iss 1, Pp 1-7 (2024)
Helicobacter pylori (H. pylori) is responsible for various chronic or acute diseases, such as stomach ulcers, dyspepsia, peptic ulcers, gastroesophageal reflux, gastritis, lymphoma, and stomach cancers. Although specific drugs are available
External link:
https://doaj.org/article/8dc794f002d7491dafe4c0254f55dddc
Diffusion processes that evolve according to linear stochastic differential equations are an important family of continuous-time dynamic decision-making models. Optimal policies are well-studied for them, under full certainty about the drift matrices
External link:
http://arxiv.org/abs/2206.09977
This work theoretically studies a ubiquitous reinforcement learning policy for controlling the canonical model of continuous-time stochastic linear-quadratic systems. We show that randomized certainty equivalent policy addresses the exploration-explo
External link:
http://arxiv.org/abs/2206.04434
Contextual bandits are canonical models for sequential decision-making under uncertainty in environments with time-varying components. In this setting, the expected reward of each bandit arm consists of the inner product of an unknown parameter with
External link:
http://arxiv.org/abs/2204.04773
Published in:
Cretaceous Research, February 2025, 166
Authors:
Momeni, Mohamad Kazem, Taghipour, Hassan, Ghayebzadeh, Mehdi, Mohammadi, Mahdi, Keikhaee, Razieh
Published in:
Environmental Pollution, 15 January 2025, 365
Contextual bandits are widely used in the study of learning-based control policies for finite action spaces. While the problem is well studied for bandits with perfectly observed context vectors, little is known about the case of imperfectly observed
External link:
http://arxiv.org/abs/2202.00867
Learning-based control of linear systems has received a lot of attention recently. In popular settings, the true dynamical models are unknown to the decision-maker and need to be interactively learned by applying control inputs to the systems. Unlike th
External link:
http://arxiv.org/abs/2201.01387