Showing 1 - 7 of 7 for search: '"Masoudian, Saeed"'
We propose a new best-of-both-worlds algorithm for bandits with variably delayed feedback. In contrast to prior work, which required prior knowledge of the maximal delay $d_{\mathrm{max}}$ and had a linear dependence of the regret on it, our algorith
External link:
http://arxiv.org/abs/2308.10675
Author:
Esposito, Emmanuel, Masoudian, Saeed, Qiu, Hao, van der Hoeven, Dirk, Cesa-Bianchi, Nicolò, Seldin, Yevgeny
We study a $K$-armed bandit with delayed feedback and intermediate observations. We consider a model where intermediate observations have a form of a finite state, which is observed immediately after taking an action, whereas the loss is observed aft
External link:
http://arxiv.org/abs/2305.19036
We present a modified tuning of the algorithm of Zimmert and Seldin [2020] for adversarial multiarmed bandits with delayed feedback, which in addition to the minimax optimal adversarial regret guarantee shown by Zimmert and Seldin simultaneously achi
External link:
http://arxiv.org/abs/2206.14906
Author:
Masoudian, Saeed, Seldin, Yevgeny
Published in:
Conference on Learning Theory 134 (2021) 3330-3350
We derive improved regret bounds for the Tsallis-INF algorithm of Zimmert and Seldin (2021). We show that in adversarial regimes with a $(\Delta,C,T)$ self-bounding constraint the algorithm achieves $\mathcal{O}\left(\left(\sum_{i\neq i^*} \frac{1}{\
External link:
http://arxiv.org/abs/2103.12487
Author:
Modegh, Rassa Ghavami, Hamidi, Mehrab, Masoudian, Saeed, Mohseni, Amir, Lotfalinezhad, Hamzeh, Kazemi, Mohammad Ali, Moradi, Behnaz, Ghafoori, Mahyar, Motamedi, Omid, Pournik, Omid, Rezaei-Kalantari, Kiara, Manteghinezhad, Amirreza, Javanmard, Shaghayegh Haghjooy, Nezhad, Fateme Abdoli, Enhesari, Ahmad, Kheyrkhah, Mohammad Saeed, Eghtesadi, Razieh, Azadbakht, Javid, Aliasgharzadeh, Akbar, Sharif, Mohammad Reza, Khaleghi, Ali, Foroutan, Abbas, Ghanaati, Hossein, Dashti, Hamed, Rabiee, Hamid R.
COVID-19 is a virus with high transmission rate that demands rapid identification of the infected patients to reduce the spread of the disease. The current gold-standard test, Reverse-Transcription Polymerase Chain Reaction (RT-PCR), has a high rate
External link:
http://arxiv.org/abs/2011.11736
As application demands for online convex optimization accelerate, the need for designing new methods that simultaneously cover a large class of convex functions and impose the lowest possible regret is highly rising. Known online optimization methods
External link:
http://arxiv.org/abs/1906.00290
Büchi automaton of records (BAR) has been proposed as a basic operational semantics for Reo coordination language. It is an extension of Büchi automaton by using a set of records as its alphabet or transition labels. Records are used to express t
External link:
http://arxiv.org/abs/1511.05070