Pashto Spoken Digits database for the automatic speech recognition research.

Authors: Abbas, Arbab Waseem; Ahmad, Nasir; Ali, Hazrat
Source: 18th International Conference on Automation & Computing (ICAC); 1/1/2012, p1-5, 5p
Abstract: This paper presents the development of a Pashto Spoken Digits database for automatic speech recognition research. To the best of the authors' knowledge, this is the first Pashto isolated digits database. The database consists of Pashto digits from zero (sefer) to hundred (sul) uttered by sixty speakers, 30 male and 30 female, aged 18 to 60 years. The recordings were made in a noise-free environment using a Sony PCM-M 10 Linear Recorder. The audio was stored in .wav format and transferred to a laptop via a USB cable. After editing, the audio was split into individual digits using Adobe Audition ver. 1.0. Isolated digit recognition experiments were then performed on a subset of the database containing the first 11 digits and 18 speakers. Mel Frequency Cepstral Coefficients (MFCC) were used as the feature vector, and a Linear Discriminant Analysis (LDA) based classifier was used for classification. [ABSTRACT FROM PUBLISHER]
Database: Complementary Index
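
Illustrative sketch: the abstract names MFCC features and an LDA-based classifier but gives no implementation details, so the following is only a minimal NumPy sketch of that kind of pipeline, not the authors' method. The sample rate (16 kHz), frame length and hop (25 ms and 10 ms), 26 mel filters, 13 coefficients, averaging MFCCs over frames to get one vector per utterance, and the synthetic tones standing in for recorded digits are all assumptions made for illustration. The names mfcc, LDAClassifier, synthetic_utterance and demo_accuracy are hypothetical.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    """Triangular filters spaced evenly on the mel scale."""
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        lo, mid, hi = bins[i], bins[i + 1], bins[i + 2]
        for k in range(lo, mid):          # rising edge of the triangle
            fb[i, k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):          # falling edge of the triangle
            fb[i, k] = (hi - k) / (hi - mid)
    return fb

def mfcc(signal, sr=16000, frame_len=400, hop=160, n_fft=512,
         n_filters=26, n_ceps=13):
    """Return an (n_frames, n_ceps) array of MFCCs (assumed settings)."""
    # Pre-emphasis, framing, and Hamming windowing.
    signal = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    # Power spectrum -> log mel filterbank energies -> DCT-II.
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    log_energy = np.log(power @ mel_filterbank(n_filters, n_fft, sr).T + 1e-10)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps),
                                  np.arange(n_filters) + 0.5) / n_filters)
    return log_energy @ dct.T

class LDAClassifier:
    """Gaussian classifier with one covariance matrix shared by all classes."""

    def fit(self, X, y):
        self.classes = np.unique(y)
        self.means = np.array([X[y == c].mean(axis=0) for c in self.classes])
        centered = X - self.means[np.searchsorted(self.classes, y)]
        cov = centered.T @ centered / (len(X) - len(self.classes))
        cov += 1e-6 * np.eye(X.shape[1])  # small ridge for numerical stability
        self.prec = np.linalg.inv(cov)
        self.log_priors = np.log([np.mean(y == c) for c in self.classes])
        return self

    def predict(self, X):
        # Linear discriminant: x' S^-1 mu_k - 0.5 mu_k' S^-1 mu_k + log prior_k
        linear = X @ self.prec @ self.means.T
        offset = 0.5 * np.sum(self.means @ self.prec * self.means, axis=1)
        return self.classes[np.argmax(linear - offset + self.log_priors, axis=1)]

def synthetic_utterance(freq, rng, sr=16000, dur=0.5):
    """Noisy tone used as a stand-in for a recorded digit (not real speech)."""
    t = np.arange(int(sr * dur)) / sr
    phase = rng.uniform(0.0, 2.0 * np.pi)
    return np.sin(2.0 * np.pi * freq * t + phase) + 0.05 * rng.standard_normal(t.size)

def demo_accuracy(seed=0):
    """Train on synthetic 'digits' and return held-out accuracy."""
    rng = np.random.default_rng(seed)
    tones = {0: 300.0, 1: 1000.0, 2: 2500.0}  # hypothetical class -> tone (Hz)

    def make(n):
        X = [mfcc(synthetic_utterance(f, rng)).mean(axis=0)
             for f in tones.values() for _ in range(n)]
        y = [label for label in tones for _ in range(n)]
        return np.array(X), np.array(y)

    X_train, y_train = make(10)
    X_test, y_test = make(5)
    model = LDAClassifier().fit(X_train, y_train)
    return float(np.mean(model.predict(X_test) == y_test))
```

Calling demo_accuracy() trains on the synthetic tones and returns the fraction of held-out examples classified correctly. To try the same flow on real recordings, each .wav file would be loaded as a mono float array and passed through mfcc in place of synthetic_utterance; the result says nothing about the recognition rates reported in the paper.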