Separating Optical and Language Models through Encoder-Decoder Strategy for Transferable Handwriting Recognition

Authors:	Solen Quiniou, Christian Viard-Gaudin, Emmanuel Morin, Harold Mouchère, Adeline Granet
Contributors:	Image Perception Interaction (IPI), Laboratoire des Sciences du Numérique de Nantes (LS2N), Université de Nantes - UFR des Sciences et des Techniques (UN UFR ST), Université de Nantes (UN)-Université de Nantes (UN)-École Centrale de Nantes (ECN)-Centre National de la Recherche Scientifique (CNRS)-IMT Atlantique Bretagne-Pays de la Loire (IMT Atlantique), Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT)-Université de Nantes - UFR des Sciences et des Techniques (UN UFR ST), Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT), Traitement Automatique du Langage Naturel (TALN)
Language:	English
Publication year:	2018
Subject:	Handwriting recognition; Computer science; Speech recognition; Feature extraction; 02 engineering and technology; [INFO.INFO-NE]Computer Science [cs]/Neural and Evolutionary Computing [cs.NE]; [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]; [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI]; [INFO.INFO-TS]Computer Science [cs]/Signal and Image Processing; 020204 information systems; 0202 electrical engineering electronic engineering information engineering; Rotary encoder; Artificial neural network; business.industry; Optical model; knowledge transfer; Language model; Transformation (function); ComputingMethodologies_DOCUMENTANDTEXTPROCESSING; 020201 artificial intelligence & image processing; Artificial intelligence; Transfer of learning; business; Word (computer architecture)
Source:	16th International Conference on Frontiers in Handwriting Recognition (ICFHR), Aug 2018, Niagara Falls, Canada
Description:	International audience; Lack of data can be an issue when beginning a new study on historical handwritten documents. To deal with this, we propose a deep-learning-based recognizer that separates the optical model from the language model so that each can be trained separately on different resources. In this work, we present the optical encoder part of a multilingual transductive transfer learning approach applied to historical handwriting recognition. The optical encoder transforms the input word image into a non-latent space that depends only on letter n-grams, which makes the encoder independent of the language. This transformation avoids embedding a language model in the optical encoder and makes transfer learning possible across languages that use the same alphabet. The language decoder generates a word, as a sequence of characters, from a vector of letter n-grams. Experiments show that separating the optical and language models can be a solution for multilingual transfer learning.
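The letter n-gram representation mentioned in the description can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`letter_ngrams`, `build_vocab`, `encode`), the maximum n-gram length of 2, and the use of count vectors are all assumptions chosen for illustration. The sketch only shows how a word maps to a vector that depends on its letters and not on any language-specific lexicon.

```python
from collections import Counter


def letter_ngrams(word, n_max=2):
    """Return all letter n-grams of length 1..n_max found in `word`."""
    return [word[i:i + n]
            for n in range(1, n_max + 1)
            for i in range(len(word) - n + 1)]


def build_vocab(words, n_max=2):
    """Sorted list of every letter n-gram seen in `words` (hypothetical helper)."""
    return sorted({g for w in words for g in letter_ngrams(w, n_max)})


def encode(word, vocab, n_max=2):
    """Count vector of `word` over the n-gram vocabulary (assumed target format)."""
    counts = Counter(letter_ngrams(word, n_max))
    return [counts[g] for g in vocab]


# Toy word list; any words over the same alphabet share one vocabulary.
words = ["chat", "hat"]
vocab = build_vocab(words)
vec = encode("hat", vocab)
print(vocab)  # ['a', 'at', 'c', 'ch', 'h', 'ha', 't']
print(vec)    # [1, 1, 0, 0, 1, 1, 1]
```

In this toy setting, "chat" (French) and "hat" (English) are encoded over the same vocabulary, which is the property that lets an encoder trained on one language be reused for another with the same alphabet.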
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::4432305403acaf19866d3a28a7a82334 https://hal.archives-ouvertes.fr/hal-01821598/file/icfhr(2).pdf