Unsupervised phone segmentation method using delta spectral function

Author: Hsiao-Chuan Wang, Dac-Thang Hoang
Publication year: 2011
Subject:
Source: 2011 International Conference on Speech Database and Assessments (Oriental COCOSDA).
Description: Unsupervised phone segmentation detects the phone boundaries in an utterance without prior knowledge of the text content. A spectral change in the speech signal usually implies a phone boundary. In this paper, the Delta Spectral Function (DSF) is defined for each frame to represent the variation of band energy in a specific band. A number of bands that give the highest DSF values in a frame are then chosen to define a measure of spectral change. The chosen bands are not fixed; they are selected dynamically, frame by frame. Peaks of the spectral change curve are treated as candidate boundaries, and a fine-tuning procedure then selects which peaks become the detected boundaries. The proposed method achieves an F-value of 75.3% under the condition of near-zero over-segmentation, where the recall rate is also 75.3%. This result is better than many previously reported results. In addition, the computation is simple and the method is easy to implement.
Database: OpenAIRE
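
The pipeline outlined in the abstract (per-band delta, dynamic selection of the strongest bands, peak picking) can be illustrated with a minimal sketch. The abstract does not give the exact DSF formula, the band layout, or the fine-tuning rule, so everything below is an assumption made for illustration: equal-width FFT bands, a regression-style delta on log band energy, a sum over the top-k bands, and a simple threshold plus minimum-gap rule standing in for the fine-tuning step. All function names and parameter values are hypothetical.

```python
import numpy as np

def band_energies(signal, frame_len=400, hop=160, n_bands=20):
    """Log energy per frame in n_bands equal-width FFT bands (assumed layout)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hamming(frame_len)
    frames = np.stack([signal[i * hop: i * hop + frame_len] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    bands = np.array_split(power, n_bands, axis=1)
    energy = np.stack([b.sum(axis=1) for b in bands], axis=1)
    return np.log(energy + 1e-10)  # small floor avoids log(0)

def delta_spectral_function(log_e, width=2):
    """Assumed DSF: magnitude of a regression delta over +/- width frames."""
    n = log_e.shape[0]
    padded = np.pad(log_e, ((width, width), (0, 0)), mode="edge")
    num = sum(k * (padded[width + k: width + k + n]
                   - padded[width - k: width - k + n])
              for k in range(1, width + 1))
    den = 2 * sum(k * k for k in range(1, width + 1))
    return np.abs(num / den)

def spectral_change(dsf, top_k=5):
    """Sum the top_k largest DSF values in each frame.

    Sorting per frame means the contributing bands can differ
    from one frame to the next.
    """
    return np.sort(dsf, axis=1)[:, -top_k:].sum(axis=1)

def pick_boundaries(curve, min_gap=3, threshold=None):
    """Stand-in for fine-tuning: keep local maxima above a threshold,
    preferring stronger peaks and enforcing a minimum gap in frames."""
    if threshold is None:
        threshold = curve.mean()
    peaks = [t for t in range(1, len(curve) - 1)
             if curve[t] > curve[t - 1]
             and curve[t] >= curve[t + 1]
             and curve[t] > threshold]
    kept = []
    for t in sorted(peaks, key=lambda t: curve[t], reverse=True):
        if all(abs(t - u) >= min_gap for u in kept):
            kept.append(t)
    return sorted(kept)
```

Under these assumptions, a 16 kHz signal would be processed as `pick_boundaries(spectral_change(delta_spectral_function(band_energies(signal))))`, which returns candidate boundary positions as frame indices (multiply by the hop size to get sample positions). The parameter values are placeholders and are not taken from the paper.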