Towards Spike-Based Speech Processing: A Biologically Plausible Approach to Simple Acoustic Classification
Author: | Ismail Uysal, John G. Harris, Harsha M. Sathyendra |
---|---|
Year of publication: | 2008 |
Subject: | Speech perception; Computer science; Liquid state machine; Applied Mathematics; Speech recognition; Bandwidth (signal processing); Speech processing; medicine.anatomical_structure; Computer Science (miscellaneous); medicine; Auditory system; Mel-frequency cepstrum; Neural coding; Engineering (miscellaneous); Coding (social sciences) |
Source: | International Journal of Applied Mathematics and Computer Science. 18:129-137 |
ISSN: | 1641-876X |
Description: | Shortcomings of automatic speech recognition (ASR) applications become more evident as they are used more widely in real-world settings. The inherent non-stationarity in the timing of speech signals, together with dynamic changes in the environment, makes analysis and recognition extremely difficult. Researchers often turn to biology for clues to building better engineered systems, and ASR is no exception: feature sets such as Mel-frequency cepstral coefficients employ filter banks that resemble cochlear filter banks in frequency distribution and bandwidth. In this paper, we delve deeper into the mechanics of the human auditory system to take this biological inspiration to the next level. The main goal of this research is to investigate the computational potential of spike trains produced at the early stages of the auditory system for a simple acoustic classification task. First, various spike coding schemes, from temporal to rate coding, are explored, together with spike-based encoders of varying complexity, such as rank order coding and the liquid state machine. Based on these findings, a biologically plausible system architecture that uses spikes exclusively for computation is proposed for recognizing phonetically simple acoustic signals. In performance tests on a noisy vowel data set, the proposed system outperforms a conventional ASR system. |
Database: | OpenAIRE |