Modeling intra-label dynamics in connectionist temporal classification

Authors: Ahad Harati, Kamaledin Ghiasi-Shirazi, Ashkan Sadeghi Lotfabadi
Year of publication: 2017
Subject:
Source: 2017 7th International Conference on Computer and Knowledge Engineering (ICCKE).
DOI: 10.1109/iccke.2017.8167906
Description: Most sequence processing tasks can be cast as a problem of mapping a sequence of observations into a sequence of labels. This is a very difficult problem because the association between input data sequences and output label sequences is not given at the frame level. Recurrent neural networks (RNNs) equipped with connectionist temporal classification (CTC) are among the best tools devised to handle this problem and have been used to achieve state-of-the-art results in many handwriting and speech recognition tasks. The reason that RNNs are used instead of feedforward networks in combination with CTC is that CTC does not model the dynamics of sequences. Specifically, the long short-term memory (LSTM) RNN, which is excellent at memorizing information for a long time, is used in combination with CTC to overcome the limitations of CTC in modeling the dynamics of sequences. In this paper, we propose to model each label with a sequence of hidden sub-labels at the CTC level. The proposed framework allows CTC to learn the intra-label relations, which transfers part of the load of learning dynamical sequences from the RNN to CTC. Our experiments on handwriting recognition tasks show that the proposed method outperforms standard CTC in terms of accuracy.
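The idea of replacing each label with a sequence of hidden sub-labels can be illustrated with a small sketch. The abstract does not give the paper's exact formulation, so everything below is an assumption made for illustration: the fixed sub-label count `k`, the `(label, index)` encoding, and the helper names `expand_to_sublabels` and `merge_sublabels` are all hypothetical. Only `ctc_collapse` reflects the standard CTC mapping (merge consecutive repeats, then drop blanks).

```python
BLANK = None  # hypothetical marker for the CTC blank symbol


def ctc_collapse(path):
    """Standard CTC mapping: merge consecutive repeats, then drop blanks."""
    out = []
    prev = object()  # sentinel that equals no real symbol
    for sym in path:
        if sym != prev and sym is not BLANK:
            out.append(sym)
        prev = sym
    return out


def expand_to_sublabels(labels, k):
    """Assumed scheme: each label becomes k ordered sub-labels (label, 0..k-1)."""
    return [(lab, i) for lab in labels for i in range(k)]


def merge_sublabels(sub_seq, k):
    """Inverse of expand_to_sublabels: each complete run of k sub-labels -> one label."""
    if len(sub_seq) % k != 0:
        raise ValueError("sub-label sequence length is not a multiple of k")
    labels = []
    for start in range(0, len(sub_seq), k):
        chunk = sub_seq[start:start + k]
        lab = chunk[0][0]
        if chunk != [(lab, i) for i in range(k)]:
            raise ValueError("incomplete or out-of-order sub-label run")
        labels.append(lab)
    return labels


# A frame-level path over sub-labels for the target "ab" with k = 2.
frame_path = [("a", 0), ("a", 0), BLANK, ("a", 1), ("b", 0), ("b", 1), ("b", 1)]
decoded = merge_sublabels(ctc_collapse(frame_path), k=2)
```

In this sketch the network would emit sub-label posteriors per frame, and the ordering constraint inside each label is what lets the CTC layer capture some intra-label structure. A repeated label such as "aa" expands to `(a, 0), (a, 1), (a, 0), (a, 1)`, which contains no adjacent duplicates.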
Database: OpenAIRE