Abstract: |
Over the last several years, a major factor in reducing the error rate of most speech recognition systems has been the addition of new feature components to the frame vectors. However, the larger dimensionality of the frame feature vector also increases the number of model parameters and the computational requirements. To improve recognition performance, it is not feasible to increase the size of the frame feature vectors indefinitely, nor is it satisfactory or practical to run experiments on combinatorially chosen subsets of the feature set to pick the best-performing subspace with the desired number of dimensions. It is therefore clearly desirable to understand which components of the frame feature vector contribute most to recognition performance and to discard the least useful components. Our feature ordering method allows new sets of features to be selectively incorporated into existing signal analysis methods. Discriminative analysis has been used successfully for hidden Markov model (HMM) parameter estimation. In this study, we use discriminative methods to perform feature selection in the frame feature space. The components of the feature vectors are rank ordered according to an objective criterion, and only the most "significant" components are used for recognition. The proposed feature reduction method has been applied to a 38-dimensional vector consisting of the first- and second-order time derivatives of the frame energy and of the cepstral coefficients with their first and second derivatives. Speaker-independent recognition experiments with reduced feature sets were performed on three databases of high-quality non-telephone speech and a database of telephone speech recorded during a field trial. The dimension of the frame feature vectors, and hence the number of model parameters, was greatly reduced (by a factor of two) without a significant loss of recognition performance. Copyright 1993, 1999 Academic Press
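The rank-order-and-truncate step described in the abstract can be sketched as follows. The abstract does not specify the objective criterion used to rank feature components, so this minimal sketch substitutes a per-dimension Fisher discriminant ratio as a hypothetical stand-in; the function names `fisher_ratio` and `select_top_features` are illustrative, not from the paper.

```python
import numpy as np

def fisher_ratio(features, labels):
    """Per-dimension discriminability score: between-class variance
    divided by within-class variance (a stand-in criterion)."""
    classes = np.unique(labels)
    overall_mean = features.mean(axis=0)
    between = np.zeros(features.shape[1])
    within = np.zeros(features.shape[1])
    for c in classes:
        x = features[labels == c]
        class_mean = x.mean(axis=0)
        between += len(x) * (class_mean - overall_mean) ** 2
        within += ((x - class_mean) ** 2).sum(axis=0)
    return between / np.maximum(within, 1e-12)  # guard against zero variance

def select_top_features(features, labels, keep):
    """Rank feature dimensions by the criterion and keep the `keep`
    most discriminative ones (e.g. halving a 38-dim frame vector)."""
    scores = fisher_ratio(features, labels)
    order = np.argsort(scores)[::-1]  # most discriminative first
    return order[:keep]
```

In this sketch, halving a 38-dimensional frame vector would correspond to `select_top_features(X, y, keep=19)`; the retained indices then define the reduced subspace used for recognition.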