LSTM Deep Neural Networks Postfiltering for Improving the Quality of Synthetic Voices
Author: | John Goddard-Close, Marvin Coto-Jiménez |
---|---|
Language: | English |
Year of publication: | 2016 |
Subject: |
FOS: Computer and information sciences; Computer Science - Sound (cs.SD); Computer Science - Neural and Evolutionary Computing (cs.NE); Statistical parametric speech synthesis; Speech synthesis; Speech recognition; Hidden Markov Models (HMM); Deep neural networks; Deep learning; Long short-term memory (LSTM); Postfiltering; Small footprint; Artificial intelligence |
Source: | Lecture Notes in Computer Science, ISBN 9783319393926: MCPR Pattern Recognition (pp. 280-289). Guanajuato, Mexico: Springer, Cham |
Description: | Recent developments in speech synthesis have produced systems capable of generating intelligible speech, and researchers now strive to create models that more accurately mimic human voices. One such development is the incorporation of multiple linguistic styles in various languages and accents. HMM-based speech synthesis is of great interest to many researchers due to its ability to produce sophisticated features with a small footprint. Despite such progress, its quality has not yet reached the level of the predominant unit-selection approaches, which choose and concatenate recordings of real speech. Recent efforts have been made to improve these systems. In this paper we present the application of Long Short-Term Memory deep neural networks as a postfiltering step in HMM-based speech synthesis, in order to obtain spectral characteristics closer to those of natural speech. The results show how HMM-based voices can be improved using this approach. (An illustrative sketch of this postfiltering setup is given after the record below.) 5 pages, 5 figures |
Database: | OpenAIRE |
External link: |
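
Below is a minimal sketch of the postfiltering idea described in the abstract: a recurrent network trained to map the spectral features of HMM-synthesized speech toward the corresponding features of natural speech. It assumes frame-aligned pairs of synthetic and natural mel-cepstral features; the feature dimension, hidden size, training loop, and PyTorch framing are illustrative assumptions, not details taken from the paper.

```python
# Hypothetical sketch of an LSTM postfilter for HMM-based speech synthesis.
# x = spectral features of HMM-synthesized speech, y = features of natural
# speech for the same (frame-aligned) utterances. All sizes are assumed.
import torch
import torch.nn as nn

N_COEFFS = 40      # assumed mel-cepstral order per frame
HIDDEN = 256       # assumed LSTM hidden size


class LSTMPostfilter(nn.Module):
    def __init__(self, n_coeffs: int = N_COEFFS, hidden: int = HIDDEN):
        super().__init__()
        self.lstm = nn.LSTM(n_coeffs, hidden, batch_first=True)
        self.proj = nn.Linear(hidden, n_coeffs)   # enhanced coefficients per frame

    def forward(self, x):                          # x: (batch, frames, n_coeffs)
        h, _ = self.lstm(x)
        return self.proj(h)


if __name__ == "__main__":
    # Random stand-ins for frame-aligned synthetic (x) / natural (y) feature pairs.
    x = torch.randn(8, 200, N_COEFFS)
    y = torch.randn(8, 200, N_COEFFS)

    model = LSTMPostfilter()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()

    for step in range(5):                          # tiny illustrative training loop
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()

    enhanced = model(x)                            # postfiltered spectral trajectory
```

At synthesis time, such a postfilter would be applied to the spectral parameters generated by the HMM system before waveform generation; the choice of a sequence-to-sequence mapping reflects the LSTM's ability to use temporal context across frames.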