A subspace based progressive coding method for speech compression
Authors: | Serkan Keser, Ömer Nezih Gerek, Erol Seke, Mehmet Bilginer Gulmezoglu |
---|---|
Contributors: | Anadolu Üniversitesi, Mühendislik Fakültesi, Elektrik ve Elektronik Mühendisliği Bölümü, Gerek, Ömer Nezih, Kırşehir Ahi Evran Üniversitesi, Teknik Bilimler Meslek Yüksekokulu, Elektrik ve Otomasyon Bölümü |
Year of publication: | 2017 |
Subject: |
Linguistics and Language; Computer science; Speech recognition; Speech coding; TIMIT; 02 engineering and technology; Speech Codecs; Language and Linguistics; 020401 chemical engineering; 0202 electrical engineering, electronic engineering, information engineering; Codec; 0204 chemical engineering; Karhunen–Loève theorem; Subspace Methods; Communication; Vector quantization; 020206 networking & telecommunications; Independent Component Analysis (ICA); Independent component analysis; Computer Science Applications; Modeling and Simulation; Computer Vision and Pattern Recognition; Karhunen Loeve Transform (KLT); Software; Subspace topology; Coding (social sciences) |
Source: | Speech Communication. 94:50-61 |
ISSN: | 0167-6393 |
Description: | WOS: 000414819300005. In this study, two novel methods, based on the Karhunen-Loève Transform (KLT) and Independent Component Analysis (ICA), are proposed for coding speech signals. Instead of working directly with eigenvalue magnitudes, the KLT- and ICA-based methods take the eigenvectors of covariance matrices (or the independent components, for ICA) and geometrically group them into a smaller number of vectors. In this way, a more compact data representation is achieved. Further compression is achieved by discarding the autocovariance eigenvectors that correspond to small eigenvalues and applying vector quantization to the remaining eigenvectors. Additionally, this study proposes an iterative error refinement process, which uses the rest of the available bandwidth to transmit an efficient representation of the description error for better SNR. The overall process constitutes a new approach to efficient speech coding, with ICA being used in subspace speech coding for the first time. Constant bit rate (CBR) and variable bit rate (VBR) coding algorithms are employed with the proposed methods. The TIMIT speech database is used in the experimental studies. Speech signals are synthesized at 2.4 kbps, 8 kbps, 12.2 kbps, 16 kbps, 16.4 kbps and 19.85 kbps using various frame lengths. The quality of the synthesized speech signals is compared with that of available speech codecs, i.e., LPC (2.4 kbps), G.728 (LD-CELP, 16 kbps), G.729A (CS-CELP, 8 kbps), EVS (16.4 kbps), AMR-NB (12.2 kbps) and AMR-WB (19.85 kbps). |
Database: | OpenAIRE |
External link: |
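
The description mentions discarding the autocovariance eigenvectors that correspond to small eigenvalues. The following is a minimal sketch of that single step only, not the authors' method: it omits the geometric grouping of eigenvectors, the vector quantization, the ICA variant and the iterative error refinement. The function names `klt_truncate` and `snr_db`, the frame length of 32 samples and the synthetic correlated signal standing in for speech are all assumptions made for illustration.

```python
import numpy as np


def klt_truncate(frames, k):
    """Reconstruct frames from the k leading eigenvectors of their covariance."""
    mean = frames.mean(axis=0)
    centered = frames - mean
    cov = centered.T @ centered / len(frames)
    # eigh returns eigenvalues in ascending order, so reverse the columns
    # to put the eigenvectors of the largest eigenvalues first.
    _, eigvecs = np.linalg.eigh(cov)
    basis = eigvecs[:, ::-1][:, :k]
    coeffs = centered @ basis          # k coefficients per frame
    return coeffs @ basis.T + mean     # back to the sample domain


def snr_db(original, reconstructed):
    """Signal-to-noise ratio of a reconstruction, in decibels."""
    noise = np.sum((original - reconstructed) ** 2)
    return 10 * np.log10(np.sum(original ** 2) / noise)


# Hypothetical test signal: a first-order autoregressive process,
# used here only because neighbouring samples are correlated.
rng = np.random.default_rng(0)
n_samples = 32 * 500
white = rng.standard_normal(n_samples)
signal = np.empty(n_samples)
signal[0] = white[0]
for i in range(1, n_samples):
    signal[i] = 0.95 * signal[i - 1] + white[i]

frames = signal.reshape(-1, 32)        # 500 frames of 32 samples each

for k in (4, 16):
    rec = klt_truncate(frames, k)
    print(f"k={k:2d} eigenvectors kept, SNR = {snr_db(frames, rec):.2f} dB")
```

Keeping more eigenvectors raises the SNR at the cost of more coefficients per frame, which is the trade-off a bit-rate-constrained coder has to balance.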