Formant speech synthesis: improving production quality

Author:	Donald G. Childers, N.B. Pinto, Ajit L. Lalwani
Year of publication:	1989
Subject:	Voice activity detection; Formant; Computer science; Speech recognition; Signal processing; Speech coding; Pitch detection algorithm; Speech synthesis; Speech processing; Linear predictive coding; Voice analysis
Source:	IEEE Transactions on Acoustics, Speech, and Signal Processing. 37:1870-1887
ISSN:	0096-3518
DOI:	10.1109/29.45534
Description:	The authors describe analysis and synthesis methods for improving the quality of speech produced by D.H. Klatt's (J. Acoust. Soc. Am., vol. 67, pp. 971-995, 1980) software formant synthesizer. Listeners perceptually preferred synthetic speech generated with an excitation waveform resembling the glottal volume-velocity over speech synthesized with other types of excitation. In addition, listeners ranked speech tokens synthesized with an excitation waveform that simulated the effects of source-tract interaction higher in naturalness than tokens synthesized without such interaction. A series of algorithms was developed for classifying intervals as silent, voiced, unvoiced, or mixed excitation, and for pitch detection, formant estimation, and formant tracking. The algorithms can use two channels of input data, i.e., speech and electroglottographic signals, and can therefore surpass the performance of single-channel (acoustic-signal-based) algorithms. The formant synthesizer was used to study some aspects of the acoustic correlates of voice quality, e.g., male/female voice conversion and the simulation of breathiness, roughness, and vocal fry.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::e2818ca5eefc139f6c057e41d6007e59 https://doi.org/10.1109/29.45534