Scan, Attend and Read: End-to-End Handwritten Paragraph Recognition with MDLSTM Attention

Authors: Jérôme Louradour, Theodore Bluche, Ronaldo Messina
Language: English
Publication year: 2016
Subject:
FOS: Computer and information sciences
Closed captioning
Computer science
Arabic
Computer Vision and Pattern Recognition (cs.CV)
media_common.quotation_subject
ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION
Computer Science - Computer Vision and Pattern Recognition
02 engineering and technology
010501 environmental sciences
computer.software_genre
01 natural sciences
Reading (process)
0202 electrical engineering, electronic engineering, information engineering
Hidden Markov model
0105 earth and related environmental sciences
media_common
business.industry
Image segmentation
language.human_language
ComputingMethodologies_PATTERNRECOGNITION
Covert
Handwriting recognition
language
ComputingMethodologies_DOCUMENTANDTEXTPROCESSING
020201 artificial intelligence & image processing
Artificial intelligence
Paragraph
Transcription (software)
business
computer
Natural language processing
Source: ICDAR
Description: We present an attention-based model for end-to-end handwriting recognition. Our system does not require any segmentation of the input paragraph. The model is inspired by the differentiable attention models recently presented for speech recognition, image captioning, and translation. The main difference is the implementation of covert and overt attention with a multi-dimensional LSTM network. Our principal contribution to handwriting recognition is automatic transcription without prior segmentation into lines, which was critical in previous approaches. Moreover, the system is able to learn the reading order, enabling it to handle bidirectional scripts such as Arabic. We carried out experiments on the well-known IAM Database and report encouraging results, which raise hopes of performing full-paragraph transcription in the near future.
Database: OpenAIRE