Block encoding of speech spectral principal components.

Authors: Holland, James R.; Zahorian, Stephen A.
Source: Journal of the Acoustical Society of America; 1984, Vol. 75 Issue S1, pS59-S59, 1p
Abstract: A Karhunen-Loève (KL) series expansion was used to block encode speech spectral principal components as a function of time. Each of ten principal components was first obtained as a linear combination of 20 speech spectral band energies. Using a fixed block length of ten frames (0.128 s), the KL basis vectors were computed separately for various speakers for each principal component. However, the optimal KL basis vector set was essentially the same for each principal component and for the different speakers. The basis vector set also closely resembled a cosine basis vector set. Approximately 94% of the variance of the principal components was accounted for by five (out of ten) basis vectors. Speech was synthesized using the KL basis vectors for block encoding of ten-frame blocks of principal components. Informal listening tests indicate that very little information is lost using five basis vectors. These results indicate that speech spectral principal components, particularly the low-ordered ones, which reflect the overall spectral shape, are highly correlated in time. [Work supported by NSF.] [ABSTRACT FROM AUTHOR]
Database: Complementary Index
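
The block-encoding procedure summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The block length (ten frames) and the number of retained basis vectors (five) come from the abstract; everything else is an assumption made for illustration: the function names are hypothetical, the input is a synthetic first-order autoregressive sequence standing in for one principal-component track, the track is treated as zero-mean, and the cosine basis is taken to be an orthonormal DCT-II basis.

```python
import numpy as np

BLOCK_LEN = 10  # frames per block (from the abstract)
N_KEEP = 5      # basis vectors retained (from the abstract)

def to_blocks(track, block_len=BLOCK_LEN):
    """Cut a 1-D track into non-overlapping blocks, dropping any remainder."""
    n_blocks = len(track) // block_len
    return track[: n_blocks * block_len].reshape(n_blocks, block_len)

def kl_basis(blocks):
    """KL basis vectors (columns) and eigenvalues, largest eigenvalue first.

    Assumption: the track is zero-mean, so the block correlation matrix
    is used without removing a per-block mean.
    """
    corr = blocks.T @ blocks / blocks.shape[0]
    eigvals, eigvecs = np.linalg.eigh(corr)  # ascending order
    order = np.argsort(eigvals)[::-1]
    return eigvecs[:, order], eigvals[order]

def cosine_basis(n):
    """Orthonormal DCT-II basis vectors as columns (assumed cosine basis)."""
    t = np.arange(n)[:, None]
    k = np.arange(n)[None, :]
    basis = np.cos(np.pi * (t + 0.5) * k / n)
    basis[:, 0] *= np.sqrt(1.0 / n)
    basis[:, 1:] *= np.sqrt(2.0 / n)
    return basis

def retained_fraction(blocks, basis, n_keep=N_KEEP):
    """Fraction of total block energy captured by the first n_keep vectors."""
    energy = ((blocks @ basis) ** 2).sum(axis=0)
    return energy[:n_keep].sum() / energy.sum()

def encode_decode(blocks, basis, n_keep=N_KEEP):
    """Project each block onto n_keep basis vectors, then reconstruct it."""
    coeffs = blocks @ basis[:, :n_keep]
    return coeffs @ basis[:, :n_keep].T

# Synthetic stand-in for one principal-component track (an assumption,
# not speech data): a zero-mean AR(1) sequence with strong correlation.
rng = np.random.default_rng(0)
rho = 0.9
noise = rng.standard_normal(20000)
track = np.zeros_like(noise)
for i in range(1, len(track)):
    track[i] = rho * track[i - 1] + noise[i]

blocks = to_blocks(track)
kl_vecs, kl_vals = kl_basis(blocks)
kl_frac = retained_fraction(blocks, kl_vecs)
dct_frac = retained_fraction(blocks, cosine_basis(BLOCK_LEN))
approx = encode_decode(blocks, kl_vecs)

print(f"energy retained, KL basis:     {kl_frac:.3f}")
print(f"energy retained, cosine basis: {dct_frac:.3f}")
```

On strongly time-correlated data the two fractions are typically close, which is the kind of behavior the abstract reports when it notes that the KL basis resembled a cosine basis. The figures printed here describe only the synthetic sequence and say nothing about the 94% value reported for speech.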