Showing 1 - 10 of 10
for search: '"Gal Dalal"'
Published in:
IEEE Transactions on Power Systems. 34:2528-2540
Outage scheduling aims at defining, over a horizon of several months to years, when different components needing maintenance should be taken out of operation. Its objective is to minimize operation-cost expectation while satisfying reliability-related…
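The entry above summarizes outage scheduling as minimizing expected operation cost subject to reliability requirements. A schematic way to write such a problem (the notation and the chance-constraint form below are illustrative assumptions, not taken from the paper) is:

```latex
% Schematic outage-scheduling problem (illustrative notation, not the paper's):
%   x   -- schedule assigning each maintenance job to an outage period
%   \xi -- random operating conditions (demand, renewables, contingencies)
\begin{aligned}
  \min_{x \in \mathcal{X}} \quad & \mathbb{E}_{\xi}\!\left[ C_{\mathrm{op}}(x,\xi) \right] \\
  \text{s.t.} \quad & \Pr\!\left( \text{reliability criterion violated under } (x,\xi) \right) \le \varepsilon .
\end{aligned}
```

The chance constraint is just one common way to encode "reliability-related" requirements over the scheduling horizon.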
Author:
Chen Tessler, Yuval Shpigelman, Gal Dalal, Amit Mandelbaum, Doron Haritan Kazakov, Benjamin Fuhrer, Gal Chechik, Shie Mannor
We approach the task of network congestion control in datacenters using Reinforcement Learning (RL). Successful congestion control algorithms can dramatically improve latency and overall network throughput. Until today, no such learning-based algorithms…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::f8380d7cc8c18eb383a7f65ade54b1ad
Published in:
AAAI
Policy evaluation in reinforcement learning is often conducted using two-timescale stochastic approximation, which results in various gradient temporal difference methods such as GTD(0), GTD2, and TDC. Here, we provide convergence rate bounds for this…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::d037fa24422a3009b85d5c53b5f49c98
http://arxiv.org/abs/1911.09157
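The entry above concerns the gradient TD family (GTD(0), GTD2, TDC) obtained from two-timescale stochastic approximation. As a minimal sketch of what such an update looks like, here is the TDC rule with linear features; the sampler, feature dimension, and step sizes are placeholders chosen for the example, not taken from the paper.

```python
import numpy as np

# Minimal sketch of the TDC update with linear features. The feature map,
# transition sampler, and step sizes below are made up for the example.
rng = np.random.default_rng(0)
d = 8            # feature dimension
gamma = 0.95     # discount factor
alpha = 1e-3     # slow time scale (main parameter theta)
beta = 1e-2      # fast time scale (auxiliary weights w), beta >> alpha

theta = np.zeros(d)   # value-function weights, V(s) ~ phi(s) @ theta
w = np.zeros(d)       # auxiliary weights used for the gradient correction

def sample_transition():
    """Stand-in for sampling (phi(s), reward, phi(s')) from a behavior policy."""
    phi = rng.normal(size=d)
    phi_next = rng.normal(size=d)
    reward = rng.normal()
    return phi, phi_next, reward

for _ in range(10_000):
    phi, phi_next, r = sample_transition()
    delta = r + gamma * phi_next @ theta - phi @ theta   # TD error
    # TDC: gradient-corrected TD update on the slow time scale ...
    theta += alpha * (delta * phi - gamma * phi_next * (phi @ w))
    # ... and a tracking update for the auxiliary weights on the fast time scale.
    w += beta * (delta - phi @ w) * phi
```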
Published in:
AAAI 19-Thirty-Third AAAI Conference on Artificial Intelligence
AAAI 19-Thirty-Third AAAI Conference on Artificial Intelligence, Jan 2019, Honolulu, Hawaii, United States
AAAI
Scopus-Elsevier
HAL
Finite-horizon lookahead policies are abundantly used in Reinforcement Learning and demonstrate impressive empirical success. Usually, the lookahead policies are implemented with specific planning methods such as Monte Carlo Tree Search (e.g. in Alph…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::196f916985c9606299217617b4abc623
https://hal.inria.fr/hal-02273713
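The entry above studies finite-horizon lookahead policies, usually realized with planning methods such as tree search. A small sketch of an exhaustive h-step lookahead over a known tabular model (the toy model and function names are illustrative, not the paper's algorithm):

```python
import numpy as np

# Sketch of an exhaustive h-step lookahead policy over a known tabular model
# (P[s, a, s'], R[s, a]). The random toy model below is purely illustrative;
# real implementations replace the exhaustive expansion with tree search.

def lookahead_value(s, h, P, R, V, gamma):
    """Best h-step return from state s, bootstrapping with V at the horizon."""
    if h == 0:
        return V[s]
    return max(q_value(s, a, h, P, R, V, gamma) for a in range(R.shape[1]))

def q_value(s, a, h, P, R, V, gamma):
    """Expected return of taking a in s, then acting greedily for h-1 more steps."""
    return R[s, a] + gamma * sum(
        P[s, a, s2] * lookahead_value(s2, h - 1, P, R, V, gamma)
        for s2 in range(P.shape[0])
    )

def lookahead_policy(s, h, P, R, V, gamma):
    """Greedy action with respect to the h-step lookahead values."""
    return max(range(R.shape[1]), key=lambda a: q_value(s, a, h, P, R, V, gamma))

rng = np.random.default_rng(1)
nS, nA, gamma = 5, 3, 0.9
P = rng.dirichlet(np.ones(nS), size=(nS, nA))   # transition kernel P[s, a, s']
R = rng.normal(size=(nS, nA))                   # rewards R[s, a]
V0 = np.zeros(nS)                               # crude terminal value estimate
print(lookahead_policy(0, h=2, P=P, R=R, V=V0, gamma=gamma))
```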
Published in:
NeurIPS 2018-Thirty-second Conference on Neural Information Processing Systems
NeurIPS 2018-Thirty-second Conference on Neural Information Processing Systems, Dec 2018, Montréal, Canada
Scopus-Elsevier
HAL
Multiple-step lookahead policies have demonstrated high empirical competence in Reinforcement Learning, via the use of Monte Carlo Tree Search or Model Predictive Control. In a recent work (Efroni et al., 2018), multiple-step greedy policies and th…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::f028381a3aec07d15398adac3b208d6d
https://inria.hal.science/hal-01927962/file/approximate_online_cr_final.pdf
Published in:
ISGT
In this work we design and compare different supervised learning algorithms to compute the cost of Alternating Current Optimal Power Flow (ACOPF). The motivation for quick calculation of OPF cost outcomes stems from the growing need of algorithmic-based…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::f3435b7c0ea1827da9d3518fe2153582
http://arxiv.org/abs/1612.06623
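The entry above trains supervised models to predict ACOPF cost quickly. As a minimal, hedged illustration of such a proxy (not the paper's models or features), here is a ridge regression on synthetic placeholder data; in practice the labels would come from solving ACOPF offline for many sampled grid conditions.

```python
import numpy as np

# Minimal sketch: learn a fast proxy mapping grid conditions X -> ACOPF cost y
# by ridge regression. The data below are synthetic placeholders; real labels
# would be computed offline with an ACOPF solver.
rng = np.random.default_rng(0)
n, d = 500, 12                      # samples, feature dimension (e.g. bus loads)
X = rng.normal(size=(n, d))         # grid-condition features
y = X @ rng.normal(size=d) + 0.1 * rng.normal(size=n)   # stand-in "costs"

lam = 1e-2                          # ridge penalty
w = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

x_new = rng.normal(size=d)          # a new operating condition
predicted_cost = x_new @ w          # cheap approximation of the ACOPF cost
```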
Published in:
PSCC
Asset management attempts to keep the power system in working condition. It requires much coordination between multiple entities and long-term planning, often months in advance. In this work we introduce a mid-term asset management formulation as a s…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::1826ce1caf934a0f332343c5e2d821d8
We devise the Unit Commitment Nearest Neighbor (UCNN) algorithm to be used as a proxy for quickly approximating outcomes of short-term decisions, to make tractable hierarchical long-term assessment and planning for large power systems. Experimental results…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::62cc8413d288d64efb21b7c0565e61c2
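The entry above introduces UCNN, a nearest-neighbor proxy for short-term unit-commitment outcomes. A rough sketch of the nearest-neighbor idea only (the features and the Euclidean metric below are assumptions for illustration, not the paper's design):

```python
import numpy as np

# Rough illustration of a nearest-neighbor proxy for unit-commitment cost:
# precompute (scenario -> UC cost) pairs offline, then answer new queries by
# returning the cost of the closest stored scenario.
rng = np.random.default_rng(0)
library_scenarios = rng.normal(size=(1000, 24))   # e.g. 24-hour demand profiles
library_costs = rng.normal(size=1000)             # placeholder precomputed UC costs

def ucnn_cost(query_profile: np.ndarray) -> float:
    """Return the precomputed cost of the nearest stored scenario."""
    dists = np.linalg.norm(library_scenarios - query_profile, axis=1)
    return float(library_costs[np.argmin(dists)])

print(ucnn_cost(rng.normal(size=24)))
```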
Author:
Gal Dalal, Shie Mannor
Published in:
2015 IEEE Eindhoven PowerTech.
In this work we solve the day-ahead unit commitment (UC) problem, by formulating it as a Markov decision process (MDP) and finding a low-cost policy for generation scheduling. We present two reinforcement learning algorithms, and devise a third one.
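The entry above casts day-ahead unit commitment as an MDP and looks for a low-cost generation-scheduling policy; the snippet does not name the algorithms used. As a generic illustration of the MDP view only (not the paper's specific methods), a tabular Q-learning loop on a stand-in commitment MDP could look like:

```python
import numpy as np

# Generic tabular Q-learning on a stand-in commitment MDP. The environment and
# the algorithm choice here are placeholders used only to illustrate the
# "unit commitment as an MDP" formulation.
rng = np.random.default_rng(0)
n_states, n_actions = 24, 4          # e.g. hour of day x commitment decision
gamma, alpha, eps = 0.99, 0.1, 0.1

def step(s, a):
    """Toy dynamics: advance to the next hour, pay a random cost for action a."""
    cost = abs(rng.normal()) * (a + 1)
    return (s + 1) % n_states, -cost   # reward = negative operating cost

Q = np.zeros((n_states, n_actions))
s = 0
for _ in range(50_000):
    a = rng.integers(n_actions) if rng.random() < eps else int(np.argmax(Q[s]))
    s_next, r = step(s, a)
    Q[s, a] += alpha * (r + gamma * np.max(Q[s_next]) - Q[s, a])
    s = s_next

policy = Q.argmax(axis=1)            # low-cost commitment decision per hour
```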
Published in:
HAL
EWRL 2018-14th European workshop on Reinforcement Learning
EWRL 2018-14th European workshop on Reinforcement Learning, Oct 2018, Lille, France
Anderson (1965) acceleration is an old and simple method for accelerating the computation of a fixed point. However, as far as we know and quite surprisingly, it has never been applied to dynamic programming or reinforcement learning…
External link:
https://explore.openaire.eu/search/publication?articleId=dedup_wf_001::be5e4a4d6112bcc463e899b0fe963490
https://hal.inria.fr/hal-01927977
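The entry above applies Anderson (1965) acceleration to dynamic programming. A compact sketch of the idea, wrapping Anderson acceleration around value iteration on a random MDP (the memory size and regularization are illustrative choices, not the paper's), could look like:

```python
import numpy as np

# Sketch: Anderson acceleration of value iteration on a random MDP.
# The optimal value V* is the fixed point of the Bellman operator T; Anderson
# acceleration combines the last m iterates to extrapolate toward it.
rng = np.random.default_rng(0)
nS, nA, gamma, m = 30, 4, 0.95, 5
P = rng.dirichlet(np.ones(nS), size=(nS, nA))      # transition kernel P[s, a, s']
R = rng.normal(size=(nS, nA))                      # rewards R[s, a]

def T(V):
    """Bellman optimality operator."""
    return (R + gamma * P @ V).max(axis=1)

V = np.zeros(nS)
iterates, residuals = [V], [T(V) - V]
for _ in range(200):
    F = np.stack(residuals[-m:], axis=1)           # columns = recent residuals
    # Minimize ||F a|| subject to sum(a) = 1 (tiny ridge term for stability).
    G = F.T @ F + 1e-10 * np.eye(F.shape[1])
    a = np.linalg.solve(G, np.ones(F.shape[1]))
    a /= a.sum()
    # Extrapolate: combine the images of the last m iterates under T.
    V = sum(ai * T(Vi) for ai, Vi in zip(a, iterates[-m:]))
    iterates.append(V)
    residuals.append(T(V) - V)
    if np.linalg.norm(residuals[-1]) < 1e-8:
        break
```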