Comparative Study of HMM and BLSTM Segmentation-Free Approaches for the Recognition of Handwritten Text-Lines

Authors:	Laurence Likforman-Sulem, Olivier Morillot, Emmanuèle Grosicki
Contributors:	Signal, Statistique et Apprentissage (S2A), Laboratoire Traitement et Communication de l'Information (LTCI), Institut Mines-Télécom [Paris] (IMT)-Télécom Paris-Institut Mines-Télécom [Paris] (IMT)-Télécom Paris, Département Traitement du Signal et des Images (TSI), Télécom ParisTech-Centre National de la Recherche Scientifique (CNRS), Télécom ParisTech-Institut Mines-Télécom [Paris] (IMT)-Centre National de la Recherche Scientifique (CNRS), HAL, TelecomParis
Publication year:	2013
Subject:	[IINFO.INFO-TT] domain_iinfo/domain_iinfo.info-tt; [INFO.INFO-TS] Computer Science [cs]/Signal and Image Processing; Computer science; Speech recognition; Feature extraction; Pattern recognition; Image segmentation; Recurrent neural network; Handwriting recognition; Sliding window protocol; Segmentation; Language model; Artificial intelligence; Hidden Markov model; ComputingMilieux_MISCELLANEOUS; business.industry; business; 02 engineering and technology; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; 03 medical and health sciences; 0302 clinical medicine; 030217 neurology & neurosurgery
Source:	ICDAR 2013, Aug 2013, Washington, United States
DOI:	10.1109/icdar.2013.160
Description:	This paper addresses the recognition of free-style handwritten text lines. We compare two state-of-the-art segmentation-free recognition approaches. The first is the popular context-dependent HMM (Hidden Markov Model) approach. The second is the more recent BLSTM (Bidirectional Long Short-Term Memory) approach, based on recurrent neural networks with memory blocks. For a fair comparison, both recognizers use the same feature set and language model. They are compared from the following perspectives: sliding-window parameters for feature extraction, training and decoding speed, and recognition accuracy with and without a language model. We compare the two approaches on the publicly available Rimes database of French handwritten mail. Our main findings are that long frame sequences, obtained with specific window parameters, improve both recognizers, and that BLSTMs outperform HMMs in terms of word error rate (WER), at the expense of considerably longer training times.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::20b9c1479af10dd841d0dd6d7d1fda5b https://doi.org/10.1109/icdar.2013.160