Using Machine Learning Techniques to Reduce Data Annotation Time
Author: | Michael R. Gardner, Kari Torkkola, C. Schreiner, Keshu Zhang |
---|---|
Publication year: | 2006 |
Subject: | Engineering; Injury control; Accident prevention; Poison control; Machine learning; Domain (software engineering); Random forest; Medical Terminology; Annotation; Artificial intelligence; Data mining; Data Annotation; Medical Assisting and Transcription |
Source: | Proceedings of the Human Factors and Ergonomics Society Annual Meeting. 50:2438-2442 |
ISSN: | 1071-1813, 2169-5067 |
DOI: | 10.1177/154193120605002219 |
Description: | Manually annotating large databases in any domain is costly and time-consuming. We present a semi-automatic annotation tool for this purpose that uses Random Forests as bootstrapped classifiers. We describe an application of this tool on a large database of simulated driving data. The tool enables the user to verify automatically generated annotations, rather than annotating from scratch. This tool reduced the amount of time required to annotate one minute of video by a factor of six, down to approximately thirty-five seconds of annotation time per minute of video for a database of simulated driving data. The tool is limited in that its effectiveness is dependent upon the types of data collected, and the statistical boundaries between the different annotations. |
Database: | OpenAIRE |