Scribe

Authors: Jeffrey P. Bigham, Walter S. Lasecki, Raja S. Kushalnagar, Iftekhar Naim, Adam Sadilek, Daniel Gildea, Christopher D. Miller
Year of publication: 2017
Subject:
Source: Communications of the ACM. 60:93-100
ISSN: 1557-7317, 0001-0782
DOI: 10.1145/3068663
Description: Quickly converting speech to text allows deaf and hard of hearing people to interactively follow along with live speech. Doing so reliably requires a combination of perception, understanding, and speed that neither humans nor machines possess alone. In this article, we discuss how our Scribe system combines human labor and machine intelligence in real time to reliably convert speech to text with a latency of less than 4 s. To achieve this speed while maintaining high accuracy, Scribe integrates automated assistance in two ways. First, its user interface directs workers to different portions of the audio stream, slows down the portion they are asked to type, and adaptively determines segment length based on typing speed. Second, it automatically merges the partial input of multiple workers into a single transcript using a custom version of multiple-sequence alignment. Scribe illustrates the broad potential for deeply interleaving human labor and machine intelligence to provide intelligent interactive services that neither can currently achieve alone.
Database: OpenAIRE
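
Note: The merging step mentioned in the description can be illustrated with a toy sketch. This is not Scribe's algorithm: the record says only that Scribe uses a custom version of multiple-sequence alignment, and the code below swaps in a much simpler timestamp-based merge. The function name `merge_partial_captions`, the `(timestamp, word)` input format, and the `window` parameter are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical input format: (timestamp in seconds, typed word).
Word = Tuple[float, str]


def merge_partial_captions(workers: List[List[Word]], window: float = 1.0) -> List[str]:
    """Merge partial word streams from several workers into one word list.

    Simplification (assumption): words are ordered by timestamp, and a word is
    dropped when the same word was already kept within `window` seconds.
    This stands in for, and is much weaker than, multiple-sequence alignment.
    """
    # Pool every worker's words and order them by time.
    all_words = sorted(word for stream in workers for word in stream)

    merged: List[Word] = []
    for t, word in all_words:
        key = word.lower()
        # Only compare against the few most recently kept words.
        duplicate = any(
            key == kept.lower() and t - kept_t <= window
            for kept_t, kept in merged[-5:]
        )
        if not duplicate:
            merged.append((t, word))
    return [word for _, word in merged]


if __name__ == "__main__":
    # Three workers each typed an overlapping portion of the same sentence.
    workers = [
        [(0.0, "the"), (0.4, "quick"), (0.8, "brown")],
        [(0.9, "brown"), (1.3, "fox"), (1.7, "jumps")],
        [(1.6, "jumps"), (2.0, "over")],
    ]
    print(" ".join(merge_partial_captions(workers)))
```

A timestamp merge like this fails when workers mistype words or when their clocks drift, which is the kind of disagreement an alignment-based merge is meant to tolerate.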