Detection of transitions between broad phonetic classes in a speech signal
Author: | Ananthapadmanabha, T V; Girish, K V Vijay; Ramakrishnan, A G |
---|---|
Year of publication: | 2014 |
Subject: | |
Document type: | Working Paper |
Description: | Detection of transitions between broad phonetic classes in a speech signal is an important problem with applications such as landmark detection and segmentation. The proposed hierarchical method detects silence to non-silence transitions, and high-amplitude (mostly sonorants) to low-amplitude (mostly fricatives/affricates/stop bursts) transitions and vice versa. From each bandpass-filtered speech signal frame, a subset of the extremum (minimum or maximum) samples between every pair of successive zero-crossings is selected above a second-pass threshold. If the speech signal belongs to a homogeneous segment, the locations of the first and the last extrema lie on either side of the mid-point (reference) of the frame; otherwise, both locations lie to the left or to the right of the reference, indicating a transition frame. When tested on the entire TIMIT database, 93.6% of the detected transitions are within a tolerance of 20 ms of the hand-labeled boundaries. Sonorant, unvoiced non-sonorant and silence classes and their respective onsets are detected with an accuracy of about 83.5% for the same tolerance. The results are as good as, and in some respects better than, those of state-of-the-art methods for similar tasks. Comment: 12 pages, 5 figures |
Database: | arXiv |
External link: |
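
The frame-level test described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the single absolute `threshold` (standing in for the second-pass threshold, whose derivation the abstract does not give), the handling of segments at the frame edges, and the "empty" label are all assumptions made for illustration, and the bandpass filtering stage is omitted.

```python
import math


def zero_crossing_extrema(frame):
    """Return (index, value) of the largest-magnitude sample in each run of
    same-sign samples, i.e. between successive zero-crossings.
    Assumption: the runs touching the frame edges are included as well."""
    extrema = []
    start = 0
    for i in range(1, len(frame) + 1):
        # A run ends at a sign change or at the end of the frame.
        if i == len(frame) or (frame[i] >= 0) != (frame[start] >= 0):
            k = max(range(start, i), key=lambda j: abs(frame[j]))
            extrema.append((k, frame[k]))
            start = i
    return extrema


def classify_frame(frame, threshold):
    """Label a frame from the positions of its first and last selected extrema.
    `threshold` is a hypothetical absolute amplitude threshold."""
    locations = [k for k, v in zero_crossing_extrema(frame) if abs(v) > threshold]
    if not locations:
        return "empty"  # assumption: no extremum survives the threshold
    mid = (len(frame) - 1) / 2  # mid-point (reference) of the frame
    first, last = locations[0], locations[-1]
    # Extrema on either side of the reference: homogeneous segment.
    if first < mid < last:
        return "homogeneous"
    # Otherwise both lie on one side of the reference: transition frame.
    return "transition"


# Toy frames: a sinusoid over the whole frame, and one starting at mid-frame.
tone = [math.sin(2 * math.pi * 5 * n / 400) for n in range(400)]
onset = [0.0] * 200 + tone[200:]
print(classify_frame(tone, 0.5), classify_frame(onset, 0.5))
```

In this toy setup the full-frame tone has selected extrema on both sides of the mid-point, whereas the onset frame has them only in its right half, which is the condition the abstract associates with a transition frame.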