Advancements in Dueling Bandits

Authors:	Masrour Zoghi, Yisong Yue, Katja Hofmann, Yanan Sui
Year of publication:	2018
Subject:	Computer science; 0102 computer and information sciences; 02 engineering and technology; 01 natural sciences; Data science; Preference; Domain (software engineering); Action (philosophy); 010201 computation theory & mathematics; 020204 information systems; 0202 electrical engineering, electronic engineering, information engineering; Key (cryptography); Quality (business)
Source:	IJCAI
Description:	The dueling bandits problem is an online learning framework where learning happens "on-the-fly" through preference feedback, i.e., from comparisons between a pair of actions. Unlike conventional online learning settings that require absolute feedback for each action, the dueling bandits framework assumes only the presence of (noisy) binary feedback about the relative quality of each pair of actions. The dueling bandits problem is well-suited for modeling settings that elicit subjective or implicit human feedback, which is typically more reliable in preference form. In this survey, we review recent results in the theory, algorithms, and applications of the dueling bandits problem. As an emerging domain, dueling bandits has been studied intensively over the past few years. We provide an overview of recent advancements, including algorithmic advances and applications. We discuss extensions to the standard problem formulation and novel application areas, highlighting key open research questions in our discussion.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::1c9f40a4377a3f7e269dcb7f15f8a072 https://doi.org/10.24963/ijcai.2018/776