Human Belief State-Based Exploration and Exploitation in an Information-Selective Symmetric Reversal Bandit Task
Author: | Dirk Ostwald, Michael P. Milham, Shruti Ray, Philipp Schwartenbeck, Lilla Horvath, Stanley J. Colcombe |
---|---|
Year of publication: | 2021 |
Subject: | Computer science; Bayesian probability; Control (management); Exploratory research; Exploitation; Machine learning; Task (project management); Developmental and Educational Psychology; Agent-based behavioral modeling; Set (psychology); 100 Philosophie und Psychologie::150 Psychologie::150 Psychologie; Structure (mathematical logic); Original Paper; Heuristic; Perspective (graphical); Probabilistic logic; Personal effectiveness; Bandit problem; Neuropsychology and Physiological Psychology; Artificial intelligence; Exploration; Cognitive psychology |
Source: | Computational Brain & Behavior |
DOI: | 10.17169/refubium-38367 |
Description: | Humans often face sequential decision-making problems in which information about the environmental reward structure is detached from rewards for a subset of actions. For example, a medicated patient may consider partaking in a clinical trial on the effectiveness of a new drug. Taking part in the trial can provide the patient with information about the personal effectiveness of the new drug and the potential reward of a better treatment. Not taking part in the trial does not provide the patient with this information, but is associated with the reward of a (potentially less) effective treatment. In the current study, we introduce a novel information-selective reversal bandit task to model such situations and obtain choice data on this task from 24 participants. To arbitrate between different decision-making strategies that participants may use on this task, we developed a set of probabilistic agent-based behavioural models, including exploitative and explorative Bayesian agents, as well as heuristic control agents. After validating the model and parameter recovery properties of our model set and summarizing the participants’ choice data descriptively, we used a maximum likelihood approach to evaluate the participants’ choice data from the perspective of our model set. In brief, we provide evidence that participants employ a belief state-based hybrid explorative-exploitative strategy on the information-selective reversal bandit task, lending further support to the finding that humans are guided by their subjective uncertainty when solving exploration-exploitation dilemmas. |
Database: | OpenAIRE |
External link: |