Scribe
Author: | Jeffrey P. Bigham, Walter S. Lasecki, Raja S. Kushalnagar, Iftekhar Naim, Adam Sadilek, Daniel Gildea, Christopher D. Miller |
---|---|
Year of publication: | 2017 |
Subject: |
General Computer Science
Computer science; Business; Speech recognition; Social sciences; Latency (audio); Segment length; Software engineering; Engineering and technology; Perception; Electrical engineering, electronic engineering, information engineering; Psychology and cognitive sciences; Deep integration; Artificial intelligence; User interface; Human factors |
Source: | Communications of the ACM. 60:93-100 |
ISSN: | 1557-7317 0001-0782 |
DOI: | 10.1145/3068663 |
Description: | Quickly converting speech to text allows deaf and hard of hearing people to interactively follow along with live speech. Doing so reliably requires a combination of perception, understanding, and speed that neither humans nor machines possess alone. In this article, we discuss how our Scribe system combines human labor and machine intelligence in real time to reliably convert speech to text with less than 4 s latency. To achieve this speed while maintaining high accuracy, Scribe integrates automated assistance in two ways. First, its user interface directs workers to different portions of the audio stream, slows down the portion they are asked to type, and adaptively determines segment length based on typing speed. Second, it automatically merges the partial input of multiple workers into a single transcript using a custom version of multiple-sequence alignment. Scribe illustrates the broad potential for deeply interleaving human labor and machine intelligence to provide intelligent interactive services that neither can currently achieve alone. |
Database: | OpenAIRE |
External link: |
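
The description mentions merging the partial input of several workers into a single transcript. The sketch below illustrates that idea only in a simplified form: it is not the custom multiple-sequence alignment used by Scribe, but a progressive pairwise merge built on Python's standard `difflib.SequenceMatcher`. The function names `merge_pair` and `merge_transcripts`, the choice to keep the earlier worker's words on disagreement, and the sample sentences are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def merge_pair(a, b):
    """Merge two word lists that may overlap, keeping shared words once.

    Simplified stand-in for alignment-based merging (an assumption,
    not the Scribe algorithm).
    """
    merged = []
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("equal", "delete"):
            # Words both workers typed, or words only the first typed.
            merged.extend(a[i1:i2])
        elif tag == "insert":
            # Words only the second worker typed.
            merged.extend(b[j1:j2])
        else:
            # "replace": the workers disagree. Keeping the first
            # worker's version is an arbitrary illustrative choice.
            merged.extend(a[i1:i2])
    return merged


def merge_transcripts(partials):
    """Fold a list of partial transcripts (word lists) into one."""
    merged = []
    for words in partials:
        merged = merge_pair(merged, words) if merged else list(words)
    return merged


if __name__ == "__main__":
    # Hypothetical partial captions from three workers.
    partials = [
        "the quick brown fox".split(),
        "brown fox jumps over".split(),
        "jumps over the lazy dog".split(),
    ]
    print(" ".join(merge_transcripts(partials)))
```

Because each worker hears an overlapping portion of the audio, the shared words act as anchors that let the partial captions be stitched together in order. A production system would also need to resolve disagreements between workers, which this sketch does not attempt.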