Multiregression analysis of autoregressive with exogenous input speech synthesis parameters and voice qualities

Authors:	Hiroshi Kido, Masayoshi Kawamata, Hideki Kasuya
Year of publication:	2004
Subject:	Range (music); Formant; Acoustics and Ultrasonics; Arts and Humanities (miscellaneous); Autoregressive model; Acoustics; Speech synthesis; computer.software_genre; computer; Utterance; Sentence; Mathematics; Glottal flow
Source:	The Journal of the Acoustical Society of America. 116:2545-2545
ISSN:	0001-4966
DOI:	10.1121/1.4785154
Description:	This study investigates the relationship between acoustic parameters utilized in the formant‐based ARX (autoregressive with exogenous input) speech synthesis model (J. Acoust. Soc. Jpn., 58, 386–397) and perceived voice qualities of synthetic speech. The acoustic parameters manipulated were F0 baseline, F0 range, spectral tilt of glottal flow (TL), formant scaling parameter (FS), and speaking rate (SR). Japanese expressions associated with voice qualities were high‐pitched/low‐pitched, masculine/feminine, hoarse/clear, calm/excited, powerful/weak, youthful/elderly, thick/thin, and tense/lax (Proc. ICSLP‐98, No. 1005). A sentence utterance of an average speaker selected from a database of 109 male speakers was analyzed using the ARX method. Each of the five acoustic parameters of the utterance was manipulated at three levels, producing 243 samples of synthetic speech (3×3×3×3×3). Ten subjects evaluated the voice qualities of each of the 243 synthetic stimuli with regard to the eight Japanese expressions. Multiregression analysis showed that F0 range, F0 baseline, and FS were primary acoustic correlates of high‐pitched/low‐pitched and masculine/feminine, SR and F0 range for calm/excited, and F0 range, SR and F0 baseline for thick/thin. Significant relations were not found for the remainder of the Japanese expressions, which was thought to be associated in part with irregularities of glottal flow.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::44eb060bef36035b07ff147ce5971e28 https://doi.org/10.1121/1.4785154
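
Note: The stimulus design in the description (five parameters, three levels each, 243 combinations) and the idea of regressing ratings on those parameters can be illustrated with a minimal sketch. This is not the authors' procedure. The coded levels (-1, 0, 1), the function names, and the placeholder ratings are assumptions made for illustration only, and the ratings are generated from made-up weights rather than taken from the study. The fit uses a shortcut that holds only for a balanced full factorial with symmetric coded levels, where each least-squares slope reduces to sum(x*y) / sum(x*x).

```python
from itertools import product

# Parameter names follow the abstract; the numeric coding is an assumption.
PARAMS = ["F0_baseline", "F0_range", "TL", "FS", "SR"]
LEVELS = (-1, 0, 1)  # hypothetical coding for low / middle / high


def full_factorial(n_params=len(PARAMS), levels=LEVELS):
    """Enumerate every combination of levels (3**5 = 243 rows here)."""
    return list(product(levels, repeat=n_params))


def fit_main_effects(design, ratings):
    """Least-squares intercept and main-effect slopes for a balanced design.

    In a full factorial with symmetric coded levels the columns are
    centered and mutually orthogonal, so the intercept is the mean rating
    and each slope is sum(x * y) / sum(x * x). This shortcut does not
    apply to unbalanced or correlated predictors.
    """
    n = len(design)
    intercept = sum(ratings) / n
    slopes = []
    for j in range(len(design[0])):
        sxy = sum(row[j] * y for row, y in zip(design, ratings))
        sxx = sum(row[j] ** 2 for row in design)
        slopes.append(sxy / sxx)
    return intercept, slopes


design = full_factorial()

# Placeholder ratings built from made-up weights, NOT data from the study.
assumed_weights = [0.5, 1.0, 0.0, -0.75, 0.25]
ratings = [4.0 + sum(w * x for w, x in zip(assumed_weights, row))
           for row in design]

intercept, slopes = fit_main_effects(design, ratings)
for name, slope in zip(PARAMS, slopes):
    print(f"{name}: {slope:+.3f}")
```

With real listener ratings, one such fit per rating scale would indicate which parameters carry the largest main effects. A general-purpose regression routine would be needed as soon as the design is no longer balanced.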