Intrinsic Spectral Analysis
Author: | Aren Jansen, Partha Niyogi |
---|---|
Year of publication: | 2013 |
Subject: | Basis (linear algebra); Speech recognition; Computer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing); Basis function; Speech processing; Manifold; Convolution; Computer Science::Sound; Distortion; Signal Processing; Spectrogram; Electrical and Electronic Engineering; Algorithm; Vocal tract; Mathematics |
Source: | IEEE Transactions on Signal Processing. 61:1698-1710 |
ISSN: | 1941-0476 1053-587X |
Description: | It has long been posited that the space of speech sounds is inherently low-dimensional, the result of a relatively small number of degrees of freedom involved in the human vocal apparatus. We attempt to formalize this notion by analyzing a simple physical model of the vocal tract and demonstrating that it produces transfer functions whose spectra are restricted to low-dimensional manifolds embedded in an infinite-dimensional space of square-integrable functions. While source convolution and channel distortion preclude analytic recovery of the articulatory configuration from the observed signal, we present a data-driven unsupervised learning algorithm called Intrinsic Spectral Analysis, designed to recover a set of nonlinear basis functions for the speech manifold from a stream of unannotated and unsegmented audio. Projecting a traditional spectrogram onto this nonlinear basis defines a novel acoustic representation that is demonstrated to have phonological significance, improved phonetic separability, inherent speaker independence, and complementarity with standard acoustic front-ends. |
Database: | OpenAIRE |