Showing 1 - 10 of 32 for search: '"Lee, Jongyeong"'
This paper studies the optimality of the Follow-the-Perturbed-Leader (FTPL) policy in both adversarial and stochastic $K$-armed bandits. Despite the widespread use of the Follow-the-Regularized-Leader (FTRL) framework with various choices of regulari
External link:
http://arxiv.org/abs/2403.05134
This paper studies the fixed-confidence best arm identification (BAI) problem in the bandit framework in the canonical single-parameter exponential models. For this problem, many policies have been proposed, but most of them require solving an optimi
External link:
http://arxiv.org/abs/2310.00539
Thompson sampling (TS) has been known for its outstanding empirical performance supported by theoretical guarantees across various reward models in the classical stochastic multi-armed bandit problems. Nonetheless, its optimality is often restricted
External link:
http://arxiv.org/abs/2302.14407
In the stochastic multi-armed bandit problem, a randomized probability matching policy called Thompson sampling (TS) has shown excellent performance in various reward models. In addition to the empirical performance, TS has been shown to achieve asym
External link:
http://arxiv.org/abs/2302.01544
When minimizing the empirical risk in binary classification, it is a common practice to replace the zero-one loss with a surrogate loss to make the learning objective feasible to optimize. Examples of well-known surrogate losses for binary classifica
External link:
http://arxiv.org/abs/2101.01366
We consider a document classification problem where document labels are absent but only relevant keywords of a target class and unlabeled documents are given. Although heuristic methods based on pseudo-labeling have been considered, theoretical under
External link:
http://arxiv.org/abs/1910.04385
Appropriately evaluating the discrepancy between domains is essential for the success of unsupervised domain adaptation. In this paper, we first point out that existing discrepancy measures are less informative when complex models such as deep neural
External link:
http://arxiv.org/abs/1901.10654
This paper aims to provide a better understanding of a symmetric loss. First, we emphasize that using a symmetric loss is advantageous in the balanced error rate (BER) minimization and area under the receiver operating characteristic curve (AUC) maxi
External link:
http://arxiv.org/abs/1901.09314
Thompson sampling (TS) for the parametric stochastic multi-armed bandits has been well studied under the one-dimensional parametric models. It is often reported that TS is fairly insensitive to the choice of the prior when it comes to regret bounds.
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::9afd710ec45b69d28395ae51b0fdd445