Assisted Transcription of Historical Documents by Keyword Spotting: A Performance Model

Authors: Angelo Marcelli, Adolfo Santoro, Claudio De Stefano
Language: English
Publication year: 2018
Subject:
Source: ICDAR
Description: We propose a model for estimating the time needed to transcribe a large collection of historical handwritten documents when the transcription is assisted by a keyword spotting system that follows the query-by-string approach. The model assumes that the system is segmentation-based and outputs, for each item, either a transcription (which may be right or wrong) or a reject. We also assume that any other information the system needs is obtained from the training set. The model has been validated by comparing its estimates with the actual time required for the manual transcription of pages from the Bentham dataset. Finally, we discuss possible ways of extending the model to other kinds of keyword spotting systems, such as those that output a ranked list of alternatives and/or adopt the query-by-example approach.
Database: OpenAIRE