Word sense disambiguation via human computation

Authors: Luis von Ahn, Jonathan Chu, Nitin Seemakurty, Anthony Tomasic
Publication year: 2010
Subject:
Source: Proceedings of the ACM SIGKDD Workshop on Human Computation.
Description: One formidable problem in language technology is the word sense disambiguation (WSD) problem: disambiguating the true sense of a word as it occurs in a sentence (e.g., recognizing whether the word "bank" refers to a river bank or to a financial institution). This paper explores a strategy for harnessing the linguistic abilities of human beings to develop datasets that can be used to train machine learning algorithms for WSD. To create such datasets, we introduce a new interactive system: a fun game designed to produce valuable output by engaging human players in what they perceive to be a cooperative task of guessing the same word as another player. Our system makes a valuable contribution by tackling the knowledge acquisition bottleneck in the WSD problem domain. Rather than using conventional and costly techniques of paying lexicographers to generate training data for machine learning algorithms, we delegate the work to people who are looking to be entertained.
Database: OpenAIRE