Using Machine Learning Techniques to Reduce Data Annotation Time

Authors: Michael R. Gardner, Kari Torkkola, C. Schreiner, Keshu Zhang
Publication year: 2006
Subject:
Source: Proceedings of the Human Factors and Ergonomics Society Annual Meeting. 50:2438-2442
ISSN: 1071-1813, 2169-5067
DOI: 10.1177/154193120605002219
Description: Manually annotating large databases in any domain is costly and time-consuming. We present a semi-automatic annotation tool for this purpose that uses Random Forests as bootstrapped classifiers, and we describe its application to a large database of simulated driving data. The tool lets the user verify automatically generated annotations rather than annotate from scratch. On that database, it reduced the time required to annotate one minute of video by a factor of six, to approximately thirty-five seconds. The tool is limited in that its effectiveness depends on the types of data collected and on the statistical boundaries between the different annotations.
Database: OpenAIRE
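
The verify-instead-of-annotate loop described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' tool: the two-feature synthetic data, the labels "cruise" and "turn", the batch size, and the function names are all hypothetical, and the classifier is a deliberately tiny stand-in (bagged one-split trees with a random feature choice) rather than a full Random Forest. In practice a library implementation such as scikit-learn's RandomForestClassifier would likely take its place. The human verifier is simulated by reading the known true label.

```python
import random
from collections import Counter


def train_stump(rows, labels, rng):
    """Fit a one-split tree on a bootstrap sample, using one random feature."""
    n = len(rows)
    sample = [rng.randrange(n) for _ in range(n)]  # draw with replacement
    feat = rng.randrange(len(rows[0]))             # random feature choice
    best = None
    for i in sample:  # each sampled value is a candidate threshold
        thr = rows[i][feat]
        left = [labels[j] for j in sample if rows[j][feat] <= thr]
        right = [labels[j] for j in sample if rows[j][feat] > thr]
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = (sum(y != left_label for y in left)
                  + sum(y != right_label for y in right))
        if best is None or errors < best[0]:
            best = (errors, feat, thr, left_label, right_label)
    return best[1:]  # (feature, threshold, label if <=, label if >)


def forest_predict(forest, row):
    """Majority vote over all stumps."""
    votes = Counter(lo if row[f] <= t else hi for f, t, lo, hi in forest)
    return votes.most_common(1)[0][0]


def bootstrap_annotate(seed, stream, batch_size=20, n_trees=15, rng=None):
    """Propose labels batch by batch, 'verify' them, then retrain on the result."""
    rng = rng or random.Random(0)
    rows = [x for x, _ in seed]
    labels = [y for _, y in seed]
    corrections = []  # proposals the verifier had to fix, per batch
    for start in range(0, len(stream), batch_size):
        # Retrain on everything verified so far.
        forest = [train_stump(rows, labels, rng) for _ in range(n_trees)]
        fixed = 0
        for x, truth in stream[start:start + batch_size]:
            proposed = forest_predict(forest, x)
            verified = truth  # assumption: stands in for the human check
            fixed += proposed != verified
            rows.append(x)
            labels.append(verified)
        corrections.append(fixed)
    return labels, corrections


def make_segment(rng):
    """Hypothetical synthetic segment: two features, two well-separated classes."""
    label = rng.choice(["cruise", "turn"])
    center = 0.0 if label == "cruise" else 4.0
    return [rng.gauss(center, 1.0), rng.gauss(center, 1.0)], label


data_rng = random.Random(0)
data = [make_segment(data_rng) for _ in range(120)]
labels, corrections = bootstrap_annotate(data[:20], data[20:])
print("corrections per batch:", corrections)
```

The quantity to watch is the number of corrections per batch: the fewer proposals the verifier has to fix, the closer the workflow gets to pure confirmation. How quickly that number falls depends on how separable the annotation classes are, which matches the limitation stated in the abstract.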