A formalized framework for incorporating expert labels in crowdsourcing environment
Authors: | Qingyang Hu, Kevin Chiew, Zhenguang Liu, Qinming He, Hao Huang |
---|---|
Publication year: | 2015 |
Subject: |
Computer Networks and Communications
Computer science; Feature vector; 02 engineering and technology; 010501 environmental sciences; Crowdsourcing; Machine learning; computer.software_genre; 01 natural sciences; Naive Bayes classifier; Crowds; Artificial Intelligence; 0202 electrical engineering, electronic engineering, information engineering; 0105 earth and related environmental sciences; Ground truth; business.industry; Supervised learning; ComputingMethodologies_PATTERNRECOGNITION; Hardware and Architecture; Labeled data; 020201 artificial intelligence & image processing; Artificial intelligence; business; computer; Classifier (UML); Software; Information Systems |
Source: | Journal of Intelligent Information Systems. 47:403-425 |
ISSN: | 1573-7675 0925-9902 |
DOI: | 10.1007/s10844-015-0371-6 |
Description: | Crowdsourcing services have proven efficient at collecting large amounts of labeled data for supervised learning tasks. However, the low cost of crowd workers leads to unreliable labels, which poses a new problem for learning a reliable classifier. Although various methods have been proposed to infer the ground truth or to learn from crowd data directly, there is no guarantee that they work well for highly biased or noisy crowd labels. Motivated by this limitation of crowd data, we propose a novel framework for improving the performance of crowdsourcing learning tasks with some additional expert labels. We treat each labeler as a personal classifier and combine all labelers' opinions from a model combination perspective, summarizing the evidence from crowds and experts naturally via a Bayesian classifier in the intermediate feature space formed by the personal classifiers. We also introduce active learning into our framework and propose an uncertainty sampling algorithm for actively obtaining expert labels. Experiments show that our method can significantly improve learning quality compared with methods that use crowd labels alone. |
Database: | OpenAIRE |
External link: |
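
The idea summarized in the description can be illustrated with a minimal sketch. It is not the authors' implementation, and this record does not give the details of their model. The sketch assumes binary labels, uses each crowd labeler's raw vote as a stand-in for the output of that labeler's personal classifier, fits a Bernoulli naive Bayes model on the few expert-labeled items in that vote space, and picks the next item to send to an expert by uncertainty sampling. All names (`fit_naive_bayes`, `posterior_positive`, `most_uncertain`) and the toy data are hypothetical.

```python
import math


def fit_naive_bayes(votes, expert_labels, alpha=1.0):
    """Fit a Bernoulli naive Bayes model on the expert-labeled items.

    votes: list of per-item vote vectors (one 0/1 vote per crowd labeler).
    expert_labels: dict mapping item index -> expert label (0 or 1).
    alpha: Laplace smoothing constant (an assumption of this sketch).
    """
    n_labelers = len(votes[0])
    class_count = {0: 0, 1: 0}
    ones = {0: [0] * n_labelers, 1: [0] * n_labelers}
    for i, y in expert_labels.items():
        class_count[y] += 1
        for j, v in enumerate(votes[i]):
            ones[y][j] += v
    total = sum(class_count.values())
    prior = {c: (class_count[c] + alpha) / (total + 2 * alpha) for c in (0, 1)}
    # theta[c][j] = P(labeler j votes 1 | true class c)
    theta = {
        c: [(ones[c][j] + alpha) / (class_count[c] + 2 * alpha)
            for j in range(n_labelers)]
        for c in (0, 1)
    }
    return prior, theta


def posterior_positive(model, vote_vector):
    """Return P(class = 1 | crowd votes) under the fitted model."""
    prior, theta = model
    log_p = {}
    for c in (0, 1):
        lp = math.log(prior[c])
        for j, v in enumerate(vote_vector):
            p = theta[c][j]
            lp += math.log(p if v == 1 else 1.0 - p)
        log_p[c] = lp
    m = max(log_p.values())  # subtract the max for numerical stability
    e0 = math.exp(log_p[0] - m)
    e1 = math.exp(log_p[1] - m)
    return e1 / (e0 + e1)


def most_uncertain(model, votes, expert_labels):
    """Uncertainty sampling: the unlabeled item whose posterior is closest to 0.5."""
    candidates = [i for i in range(len(votes)) if i not in expert_labels]
    return min(candidates,
               key=lambda i: abs(posterior_positive(model, votes[i]) - 0.5))


# Toy data: three crowd labelers; the third tends to give inverted labels.
votes = [
    [1, 1, 0],  # item 0
    [1, 1, 0],  # item 1
    [0, 0, 1],  # item 2
    [0, 0, 1],  # item 3
    [1, 0, 1],  # item 4 (no expert label)
    [1, 1, 0],  # item 5 (no expert label)
]
expert_labels = {0: 1, 1: 1, 2: 0, 3: 0}

model = fit_naive_bayes(votes, expert_labels)
next_query = most_uncertain(model, votes, expert_labels)
print("posterior for item 5:", posterior_positive(model, votes[5]))
print("next item to send to an expert:", next_query)
```

Because the expert labels reveal how each labeler's votes relate to the true class, the model can discount or invert a systematically biased labeler, which plain majority voting over crowd labels cannot do.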