Prediction of Cepstral Excitation Pulses for Voice Conversion

Authors: Fadoua Bahja, Joseph Di Martino, El Hassan Ibn Elhaj, Driss Aboutajdine
Contributors: Laboratoire de Recherche en Informatique et Télécommunications [Rabat] (GSCM-LRIT), Université Mohammed V de Rabat [Agdal] (UM5), Analysis, perception and recognition of speech (PAROLE), INRIA Lorraine, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA), Institut National de Recherche en Informatique et en Automatique (Inria)-Université Henri Poincaré - Nancy 1 (UHP)-Université Nancy 2-Institut National Polytechnique de Lorraine (INPL)-Centre National de la Recherche Scientifique (CNRS)-Université Henri Poincaré - Nancy 1 (UHP)-Université Nancy 2-Institut National Polytechnique de Lorraine (INPL)-Centre National de la Recherche Scientifique (CNRS), Institut National des Postes et Télécommunications [Rabat] (INPT), University of Mohammed V
Language: English
Publication year: 2012
Subject:
Source: 5th International Conference on Information Systems and Economic Intelligence (SIIE'2012), Feb 2012, Djerba, Tunisia
HAL
Description: International audience. Voice conversion is a useful technique for enhancing pathological speech so that it is perceived as normal speech, although it also covers modifying a normal source speaker's speech so that it is perceived as if a target speaker had uttered it. The parameters to be converted are obtained by matching the spectral envelopes of the vocal tract for the source and target speech. Gaussian Mixture Model (GMM) parameters are estimated to provide the conversion functions. The main contribution of our study is the prediction of the Fourier cepstrum coefficients related to the excitation signal. Such a prediction leads to a satisfactory voice conversion system. Subjective perceptual results indicate that the proposed approach yields significant improvements in the quality of the converted voice.
Database: OpenAIRE