TIME-WARPING NETWORK: A NEURAL APPROACH TO HIDDEN MARKOV MODEL BASED SPEECH RECOGNITION

Authors: Enrico Bocchieri, Esther Levin, Roberto Pieraccini
Year of publication: 1993
Subject:
Source: International Journal of Pattern Recognition and Artificial Intelligence. :783-799
ISSN: 1793-6381, 0218-0014
DOI: 10.1142/s021800149300039x
Description: Recently, much interest has been generated regarding speech recognition systems based on Hidden Markov Models (HMMs) and neural network (NN) hybrids. Such systems attempt to combine the best features of both models: the temporal structure of HMMs and the discriminative power of neural networks. In this work we establish one more relation between the HMM and the NN paradigms by introducing the time-warping network (TWN), which is a generalization of both an HMM-based recognizer and a backpropagation net. The basic element of such a network, a time-warping neuron, extends the operation of the formal neuron of a backpropagation network by warping the input pattern to match it optimally to its weights. We show that a single-layer network of TW neurons is equivalent to a Gaussian density HMM-based recognition system. This equivalent neural representation suggests ways to improve the discriminative power of this system by using backpropagation discriminative training, and/or by generalizing the structure of the recognizer to a multi-layer net. The performance of the proposed network was evaluated on a highly confusable, isolated word, multi-speaker recognition task. The results indicate that not only does the recognition performance improve, but the separation between classes is enhanced, allowing us to set up a rejection criterion to improve the confidence of the system.
Database: OpenAIRE
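Illustration: the abstract describes a time-warping neuron that warps its input to match its weights, and a rejection criterion based on class separation, but gives no equations. The following is only a rough sketch of how such a unit might look, not the authors' formulation. It assumes a strict left-to-right alignment (stay in the current state or advance by one), a local score equal to the negative half squared distance (a unit-variance Gaussian log-likelihood up to a constant), and no transition probabilities. The names `tw_neuron`, `classify`, and `min_margin` are hypothetical.

```python
import math


def local_score(frame, weight):
    # Assumed local match: unit-variance Gaussian log-likelihood up to a constant.
    return -0.5 * sum((a - b) ** 2 for a, b in zip(frame, weight))


def tw_neuron(frames, weights):
    """Score of the best monotone alignment of input frames to weight vectors.

    The alignment must start at the first weight vector and end at the last,
    and at each frame it either stays in place or advances by one (Viterbi-style
    dynamic programming). Returns -inf if there are fewer frames than weights.
    """
    n_states = len(weights)
    neg_inf = -math.inf
    score = [neg_inf] * n_states
    score[0] = local_score(frames[0], weights[0])
    for frame in frames[1:]:
        new_score = [neg_inf] * n_states
        for s in range(n_states):
            stay = score[s]
            advance = score[s - 1] if s > 0 else neg_inf
            best = max(stay, advance)
            if best > neg_inf:
                new_score[s] = best + local_score(frame, weights[s])
        score = new_score
    return score[-1]


def classify(frames, word_models, min_margin=0.0):
    """Pick the best-scoring word; return None (reject) if the gap between
    the top two scores is below min_margin. Assumes at least two word models."""
    ranked = sorted(
        ((tw_neuron(frames, w), label) for label, w in word_models.items()),
        reverse=True,
    )
    (top_score, top_label), (second_score, _) = ranked[0], ranked[1]
    if top_score - second_score < min_margin:
        return None
    return top_label


if __name__ == "__main__":
    utterance = [[0.0], [0.0], [1.0], [1.0]]
    models = {"a": [[0.0], [1.0]], "b": [[1.0], [0.0]]}
    print(classify(utterance, models, min_margin=1.0))
```

In this sketch one neuron per word plays the role of a single-layer recognizer; discriminative training of the weights and multi-layer extensions mentioned in the abstract are not shown.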