LSTM based voice conversion for laryngectomees

Author:	Inma Hernaez, Xabier Sarasola, Sneha Raman, Ibon Saratxaga, Eva Navas, David Tavarez, Luis Serrano
Contributors:	European Commission
Language:	English
Year of publication:	2018
Subject:	speech and voice disorders; voice conversion; alaryngeal voices; 020206 networking & telecommunications; 02 engineering and technology; Marie Curie; 030507 speech-language pathology & audiology; 03 medical and health sciences; Work (electrical); Pedagogy; 0202 electrical engineering, electronic engineering, information engineering; Sociology; 0305 other medical science; speech intelligibility
Source:	IberSPEECH 2018; Addi. Archivo Digital para la Docencia y la Investigación; IberSPEECH
Description:	This paper describes a voice conversion system designed with the aim of improving the intelligibility and pleasantness of oesophageal voices. Two different systems have been built, one to transform the spectral magnitude and another for the fundamental frequency, both based on DNNs. Ahocoder has been used to extract the spectral information (mel cepstral coefficients), and a specific pitch extractor has been developed to calculate the fundamental frequency of the oesophageal voices. The cepstral coefficients are converted by means of an LSTM network. The conversion of the intonation curve is implemented through two different LSTM networks, one dedicated to voiced/unvoiced detection and another to the prediction of F0 from the converted cepstral coefficients. The experiments described here involve conversion from one oesophageal speaker to a specific healthy voice. The intelligibility of the signals has been measured with a Kaldi-based ASR system. A preference test has been implemented to evaluate the subjective preference for the converted voices compared with the original oesophageal voice. The results show that spectral conversion improves ASR performance, while restoring the intonation is preferred by human listeners. This work has been partially funded by the Spanish Ministry of Economy and Competitiveness with FEDER support (RESTORE project, TEC2015-67163-C2-1-R), the Basque Government (BerbaOla project, KK-2018/00014) and the European Union's H2020 research and innovation programme under the Marie Curie European Training Network ENRICH (675324).
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::e9a457cc1e5799c100aa1476469a4381 http://hdl.handle.net/10810/32818