Separating Optical and Language Models through Encoder-Decoder Strategy for Transferable Handwriting Recognition

Authors: Solen Quiniou, Christian Viard-Gaudin, Emmanuel Morin, Harold Mouchère, Adeline Granet
Contributors: Image Perception Interaction (IPI), Laboratoire des Sciences du Numérique de Nantes (LS2N), Université de Nantes - UFR des Sciences et des Techniques (UN UFR ST), Université de Nantes (UN), École Centrale de Nantes (ECN), Centre National de la Recherche Scientifique (CNRS), IMT Atlantique Bretagne-Pays de la Loire (IMT Atlantique), Institut Mines-Télécom [Paris] (IMT), Traitement Automatique du Langage Naturel (TALN)
Language: English
Year of publication: 2018
Subject:
Handwriting recognition
Computer science
Speech recognition
Feature extraction
02 engineering and technology
[INFO.INFO-NE]Computer Science [cs]/Neural and Evolutionary Computing [cs.NE]
[INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]
[INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI]
[INFO.INFO-TS]Computer Science [cs]/Signal and Image Processing
020204 information systems
0202 electrical engineering, electronic engineering, information engineering
Rotary encoder
Artificial neural network
Optical model
knowledge transfer
Language model
Transformation (function)
ComputingMethodologies_DOCUMENTANDTEXTPROCESSING
020201 artificial intelligence & image processing
Artificial intelligence
Transfer of learning
Word (computer architecture)
Source: 16th International Conference on Frontiers in Handwriting Recognition (ICFHR), Aug 2018, Niagara Falls, Canada
Description: Lack of data can be an issue when beginning a new study on historical handwritten documents. To deal with this, we propose a deep-learning-based recognizer that separates the optical model from the language model so that each can be trained separately on different resources. In this work, we present the optical encoder part of a multilingual transductive transfer learning approach applied to historical handwriting recognition. The optical encoder transforms the input word image into a non-latent space that depends only on letter-n-grams, which makes the encoder independent of the language. This transformation avoids embedding a language model in the encoder and makes it possible to operate transfer learning across languages that use the same alphabet. The language decoder then builds a word, as a sequence of characters, from a vector of letter-n-grams. Experiments show that separating the optical and language models can be a solution for multilingual transfer learning.
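Illustration (not part of the original record): the letter-n-gram space mentioned in the abstract can be pictured with a minimal Python sketch. Everything here is an assumption made for illustration: the abstract does not state which n-gram orders are used, whether the vector holds presence flags or counts, or how the alphabet is defined. The helper names `build_ngram_vocab` and `word_to_ngram_vector` are hypothetical, and the sketch covers only the target representation, not the neural encoder or decoder.

```python
from itertools import product


def build_ngram_vocab(alphabet, max_n=2):
    """Index every letter-n-gram of length 1..max_n over the alphabet.

    Assumption: unigrams and bigrams only; the paper may use other orders.
    """
    grams = []
    for n in range(1, max_n + 1):
        grams.extend("".join(p) for p in product(alphabet, repeat=n))
    return {gram: i for i, gram in enumerate(grams)}


def word_to_ngram_vector(word, vocab, max_n=2):
    """Binary vector marking which letter-n-grams occur in the word.

    Assumption: presence flags rather than counts or positions.
    """
    vec = [0] * len(vocab)
    for n in range(1, max_n + 1):
        for i in range(len(word) - n + 1):
            gram = word[i:i + n]
            if gram in vocab:
                vec[vocab[gram]] = 1
    return vec


# The vocabulary depends only on the alphabet, not on any language lexicon.
alphabet = "abcdefghijklmnopqrstuvwxyz"
vocab = build_ngram_vocab(alphabet)

# Words from two languages sharing the alphabet map into the same space.
english = word_to_ngram_vector("table", vocab)
french = word_to_ngram_vector("tableau", vocab)
```

Because the vector is indexed by the alphabet alone, an encoder trained to predict it on one language's data targets the same output space for any other language written with that alphabet, which is the property the abstract relies on for transfer.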
Database: OpenAIRE