Multi-player multi-armed bandits: Decentralized learning with IID rewards

Authors:	Rahul Jain, Naumaan Nayyar, Dileep Kalathil
Publication year:	2012
Subject:	Cognitive radio; Computational complexity theory; business.industry; Computer science; Control channel; Multi-agent system; Wireless; Probability distribution; Artificial intelligence; business; Game theory; Zero (linguistics)
Source:	Allerton Conference
Description:	We consider the decentralized multi-armed bandit problem with distinct arms for each player. At each time instant, each player picks one arm and receives a random reward drawn from an unknown distribution with an unknown mean. The arms give different rewards to different players. If more than one player selects the same arm, all of those players receive zero reward. There is no dedicated control channel for communication or coordination among the users. We propose an online learning algorithm called dUCB4 which achieves near-O(log^2 T) regret. The motivation comes from opportunistic spectrum access by multiple secondary users in cognitive radio networks, where the users must pick among various wireless channels that look different to different users.
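The reward model in the description can be illustrated with a minimal sketch of a single round. This is not the dUCB4 algorithm, only the collision rule stated above. The function name `play_round`, the `means` table, and the choice of Bernoulli rewards are assumptions made for illustration; the record says only that rewards come from an unknown distribution.

```python
import random


def play_round(choices, means, rng):
    """Simulate one time instant of the multi-player bandit.

    choices[i]  -- arm picked by player i
    means[i][k] -- mean reward of arm k as seen by player i
                   (player-specific, as in the description)
    rng         -- a random.Random instance

    Assumption: rewards are Bernoulli with the given means.
    """
    # Count how many players picked each arm.
    picks = {}
    for arm in choices:
        picks[arm] = picks.get(arm, 0) + 1

    rewards = []
    for player, arm in enumerate(choices):
        if picks[arm] > 1:
            # Collision: every player on this arm gets zero reward.
            rewards.append(0.0)
        else:
            # Sole occupant: draw a reward from this player's own
            # distribution for the arm.
            hit = rng.random() < means[player][arm]
            rewards.append(1.0 if hit else 0.0)
    return rewards


if __name__ == "__main__":
    rng = random.Random(0)
    means = [[0.9, 0.2], [0.3, 0.8]]  # hypothetical values
    print(play_round([0, 0], means, rng))  # both players collide on arm 0
    print(play_round([0, 1], means, rng))  # distinct arms, random rewards
```

Because players observe only their own rewards and have no control channel, a decentralized policy has to learn both the player-specific means and a collision-free assignment of players to arms from this feedback alone.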
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::1881b6e5122e2b3ae909cadfea968b86 https://doi.org/10.1109/allerton.2012.6483307