Help Me to Help You
Authors: | Michael Laraia, Chris Lintott, Lucy Fortson, Mike Walmsley, Darryl Wright |
---|---|
Year of publication: | 2019 |
Subject: | Computer science; Deep learning; General Engineering; Crowdsourcing; Data science; Grouped data; Citizen science; Leverage (statistics); Unsupervised learning; Artificial intelligence; Cluster analysis; MNIST database; 01 natural sciences; 0103 physical sciences; 010303 astronomy & astrophysics; 03 medical and health sciences; 0303 health sciences; 030304 developmental biology |
Source: | ACM Transactions on Social Computing. 2:1-20 |
ISSN: | 2469-7826 2469-7818 |
DOI: | 10.1145/3362741 |
Description: | The increasing size of the datasets confronting researchers in a variety of domains has led to a range of creative responses, including the deployment of modern machine learning techniques and the advent of large-scale “citizen science” projects. However, the ability of the latter to provide suitably large training sets for the former is stretched as the size of the problem (and the competition for attention amongst projects) grows. We explore the application of unsupervised learning to leverage the structure that exists in an initially unlabelled dataset. We simulate grouping similar points before presenting those groups to volunteers to label. Citizen science labelling of grouped data is more efficient, and the gathered labels can be used to further improve the efficiency of labelling future data. To demonstrate these ideas, we perform experiments using data from the Pan-STARRS Survey for Transients (PSST) with volunteer labels gathered by the Zooniverse project Supernova Hunters, and a simulated project using the MNIST handwritten digit dataset. Our results show that, in the best case, we might expect to reduce the required volunteer effort by 87.0% and 92.8% for the two datasets, respectively. These results illustrate a symbiotic relationship between machine learning and citizen scientists, where each empowers the other, with important implications for the design of citizen science projects in the future. |
Database: | OpenAIRE |
External link: |
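
The grouping idea in the description can be illustrated with a minimal sketch. It is not the authors' pipeline: the record does not say which clustering method or effort measure the paper uses, so everything below is an assumption made for illustration. The sketch uses a plain k-means with farthest-point initialisation on synthetic 2-D points, simulates a "volunteer" who gives one label per group (here, the majority true label), and measures effort as the number of judgements needed. The names `kmeans`, `dist2`, `group_label`, and `reduction` are hypothetical.

```python
import random
from collections import Counter


def dist2(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def kmeans(points, k, iters=10):
    """Plain k-means on 2-D tuples; returns one cluster index per point."""
    # Deterministic farthest-point initialisation (an illustrative choice):
    # start from the first point, then repeatedly add the point that is
    # farthest from all centers chosen so far.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest center.
        assign = [min(range(k), key=lambda c: dist2(p, centers[c])) for p in points]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = (sum(m[0] for m in members) / len(members),
                              sum(m[1] for m in members) / len(members))
    return assign


# Synthetic stand-in data (assumption): two well-separated classes of 50 points.
rng = random.Random(0)
points, truth = [], []
for label, (cx, cy) in enumerate([(0.0, 0.0), (10.0, 10.0)]):
    for _ in range(50):
        points.append((rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)))
        truth.append(label)

# Group similar points before "showing" them to a volunteer.
assign = kmeans(points, k=2)
groups = sorted(set(assign))

# Simulated volunteer: one judgement per group, taken as the majority true label.
group_label = {
    g: Counter(t for t, a in zip(truth, assign) if a == g).most_common(1)[0][0]
    for g in groups
}

# Propagate each group's label to all of its members.
propagated = [group_label[a] for a in assign]
accuracy = sum(p == t for p, t in zip(propagated, truth)) / len(truth)

# Effort reduction: judgements per group versus one judgement per point.
reduction = 1 - len(groups) / len(points)
print(f"groups={len(groups)} accuracy={accuracy:.2f} reduction={reduction:.2%}")
```

With two groups and 100 points, the effort reduction in this toy setup is fixed by construction at 1 - 2/100. It says nothing about the 87.0% and 92.8% figures reported for the PSST and MNIST experiments, which come from real data where groups are not perfectly pure.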