Robust Harmonic Features for Classification-Based Pitch Estimation
Authors: | John H. L. Hansen, Chengzhu Yu, Dongmei Wang |
---|---|
Year of publication: | 2017 |
Subject: |
Acoustics and Ultrasonics
Computer science; Speech recognition; Audio time-scale/pitch modification; 02 engineering and technology; Viterbi algorithm; Article; 030507 speech-language pathology & audiology; 03 medical and health sciences; 0202 electrical engineering, electronic engineering, information engineering; Computer Science (miscellaneous); Electrical and Electronic Engineering; Hidden Markov model; Pitch contour; 020206 networking & telecommunications; Pitch detection algorithm; Pattern recognition; Fundamental frequency; Speech processing; Computational Mathematics; ComputingMethodologies_PATTERNRECOGNITION; Computer Science::Sound; Spectral envelope; Artificial intelligence; 0305 other medical science |
Source: | IEEE/ACM Transactions on Audio, Speech, and Language Processing. 25:952-964 |
ISSN: | 2329-9304 2329-9290 |
DOI: | 10.1109/taslp.2017.2667879 |
Description: | Pitch estimation in diverse naturalistic audio streams remains a challenge for speech processing and spoken language technology. In this study, we investigate the use of robust harmonic features for classification-based pitch estimation. The proposed pitch estimation algorithm is composed of two stages: pitch candidate generation and target pitch selection. Based on energy intensity and spectral envelope shape, five types of robust harmonic features are proposed to reflect pitch-associated harmonic structure. A neural network is adopted to model the relationship between the input harmonic features and the output pitch salience for each specific pitch candidate. In the test stage, the feature vector of each pitch candidate is processed through the neural network, and the resulting output salience indicates how likely that candidate is to be the true pitch value. Finally, based on the temporal continuity of pitch values, pitch contour tracking is performed using a hidden Markov model (HMM), with the Viterbi algorithm used for HMM decoding. Experimental results show that the proposed algorithm outperforms several state-of-the-art pitch estimation methods in accuracy at both high and low levels of additive noise. |
Database: | OpenAIRE |
External link: |
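
The final tracking stage named in the description (Viterbi decoding over per-frame pitch candidates) can be illustrated with a minimal sketch. It is not the authors' implementation: the record does not specify the HMM's transition or emission models, so the function name `viterbi_pitch_track`, the use of log-salience as the emission score, and the per-octave `jump_penalty` are all assumptions made for illustration.

```python
import math


def viterbi_pitch_track(candidates, saliences, jump_penalty=4.0):
    """Pick one pitch candidate per frame by Viterbi decoding.

    candidates[t][i]: i-th candidate pitch in Hz for frame t.
    saliences[t][i]:  score in (0, 1] for that candidate (stands in for
                      the neural-network output; hypothetical values).
    jump_penalty:     assumed cost per octave of pitch change between
                      consecutive frames (not taken from the paper).
    """
    n_frames = len(candidates)
    if n_frames == 0:
        return []
    # Log-domain score of the best path ending at each candidate of frame 0.
    score = [math.log(s) for s in saliences[0]]
    backptr = []
    for t in range(1, n_frames):
        new_score, pointers = [], []
        for f_cur, s_cur in zip(candidates[t], saliences[t]):
            best_prev, best_val = 0, -math.inf
            for j, f_prev in enumerate(candidates[t - 1]):
                # Transition cost grows with the pitch jump in octaves.
                val = score[j] - jump_penalty * abs(math.log2(f_cur / f_prev))
                if val > best_val:
                    best_prev, best_val = j, val
            new_score.append(best_val + math.log(s_cur))
            pointers.append(best_prev)
        score = new_score
        backptr.append(pointers)
    # Backtrack from the best-scoring candidate in the last frame.
    idx = max(range(len(score)), key=score.__getitem__)
    path = [idx]
    for pointers in reversed(backptr):
        idx = pointers[idx]
        path.append(idx)
    path.reverse()
    return [candidates[t][i] for t, i in enumerate(path)]


# Toy input: in the middle frame the octave-error candidate (202 Hz)
# has the higher salience, but continuity favours the 101 Hz candidate.
candidates = [[100, 200], [101, 202], [99, 198]]
saliences = [[0.9, 0.6], [0.4, 0.5], [0.9, 0.5]]
track = viterbi_pitch_track(candidates, saliences)
```

With the continuity penalty switched off (`jump_penalty=0.0`), the same input falls back to choosing the most salient candidate in each frame, which is how the octave jump in the middle frame would slip through.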