Verification of an articulatory model with different vowel vocal tract area functions and glottal signals

Author: SHU-WEI HSU, 許恕瑋
Year of publication: 2016
Document type: thesis
Description: 104
The purpose of this study is to build an articulatory model that simulates an individual’s vowel production under normal circumstances. The model uses an equivalent lumped electric circuit and related mathematical functions to represent the vocal fold and vocal tract systems, based on physiological data from the literature. Two sets of vocal tract area functions for vowel production, taken from magnetic resonance imaging (MRI) studies by Takemoto’s group and by Story, and two vocal fold models (the Rosenberg glottal signal and the two-mass model) were used to verify the model. The vocal folds are two symmetrical mucous membranes stretched across the larynx that generate sound through vibration. We simulated the glottal signal with the mathematical functions from Rosenberg’s study and with the two-mass model, which represents the vocal folds as two coupled mass-spring-damper systems. In this study, the vocal tract from the glottis to the lips was modeled as a tube of many concatenated sections. Based on the lossless tube model, we were able to use the variation of volume velocity and sound pressure to build a mathematical vocal tract model. Although this approach is relatively simple, it ignores the viscous effect of the vocal tract wall on vowel production. In contrast, Maeda proposed a vocal tract model that accounts for energy loss at the vocal tract wall and also put forward a way to transform the physical model into an equivalent electric circuit model. With Maeda’s vocal tract model, it is possible to simulate the desired vowels from the glottal signals. We used vocal tract area functions from Story’s research (/AA/, /IY/, /UW/, /AE/, /AO/) and Takemoto’s research (/a/, /i/, /u/, /e/, /o/) to verify our vocal tract model against the corresponding vowel productions.
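As an illustration of the lossless concatenated-tube idea described above, the following is a minimal sketch and not the thesis implementation. It multiplies the chain (transmission) matrices of the tube sections and locates formants where the glottis-to-lips volume-velocity transfer function has poles. It assumes an ideal open-lip termination (zero radiation impedance), a rigid wall, and no losses, so it does not include the wall losses of Maeda’s model. The function names, the air density and sound speed values, and the example area function are all assumptions made for illustration.

```python
import math

RHO = 1.14e-3  # air density in g/cm^3 (assumed value)
C = 3.5e4      # speed of sound in cm/s (assumed value)


def chain_d(freq_hz, areas_cm2, section_len_cm):
    """Return element D of the glottis-to-lips chain matrix.

    Each lossless section has the matrix [[cos, j*Z*sin], [j*sin/Z, cos]].
    The off-diagonal entries are purely imaginary, so only their real
    coefficients (b, c) are stored and the factor j*j = -1 is applied
    in the products. With zero pressure at the lips, U_lips/U_glottis = 1/D.
    """
    k = 2.0 * math.pi * freq_hz / C
    a, b, c, d = 1.0, 0.0, 0.0, 1.0  # identity matrix
    for area in areas_cm2:  # ordered from glottis to lips
        z = RHO * C / area  # characteristic impedance of the section
        cs = math.cos(k * section_len_cm)
        sn = math.sin(k * section_len_cm)
        a2, b2, c2, d2 = cs, z * sn, sn / z, cs
        a, b, c, d = (a * a2 - b * c2, a * b2 + b * d2,
                      c * a2 + d * c2, d * d2 - c * b2)
    return d


def formants(areas_cm2, section_len_cm, f_max=5000.0, step=5.0):
    """Find frequencies below f_max where D crosses zero (transfer-function poles)."""
    found = []
    f_lo = step
    d_lo = chain_d(f_lo, areas_cm2, section_len_cm)
    while f_lo < f_max:
        f_hi = f_lo + step
        d_hi = chain_d(f_hi, areas_cm2, section_len_cm)
        if d_hi == 0.0:
            found.append(f_hi)  # zero lies exactly on the scan grid
        elif d_lo * d_hi < 0.0:
            lo, hi, d_left = f_lo, f_hi, d_lo
            for _ in range(40):  # bisection refinement
                mid = 0.5 * (lo + hi)
                d_mid = chain_d(mid, areas_cm2, section_len_cm)
                if d_left * d_mid <= 0.0:
                    hi = mid
                else:
                    lo, d_left = mid, d_mid
            found.append(0.5 * (lo + hi))
        f_lo, d_lo = f_hi, d_hi
    return found


# Sanity check: a uniform 17.5 cm tube should follow the quarter-wavelength
# rule F_n = (2n - 1) * C / (4 * L).
uniform_formants = formants([3.0] * 35, 0.5)

# Hypothetical two-region shape (narrow back cavity, wide front cavity);
# the areas are made up for illustration, not taken from Story or Takemoto.
toy_formants = formants([1.0] * 17 + [7.0] * 18, 0.5)
```

A measured area function, such as the 42 to 46 sections per vowel in Story’s data, would be passed in the same way as a list of areas ordered from glottis to lips together with the section length.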
Furthermore, we combined the Rosenberg signal and the two-mass model with the Maeda model and observed how the different glottal signals affected vowel production. The results showed that both the Rosenberg signal and the two-mass model have low-pass filter characteristics; however, the frequency response of the two-mass model contained more low-frequency and less high-frequency content. In combination with the vocal tract model used in this study, these two glottal signals could be used to simulate English and Japanese vowel production, respectively. When they were used with the vocal tract portion of the DIVA (Directions Into Velocities of Articulators) model, however, they could not simulate the correct Japanese vowels because of the formant frequency range limitation defined by the DIVA model. In addition, we verified our articulatory model with the vocal tract area functions from Story’s study (the number of vocal tract sections varies from 42 to 46 depending on the vowel) and found that the differences in the first three formant frequencies between the two studies were -7.4%, -2.58%, and -0.46%, respectively. The differences between our results and Takemoto’s study (the number of vocal tract sections ranges from 68 to 75 depending on the vowel) were only -2.01%, 1.99%, and -0.75%, respectively. In summary, our model could simulate an individual’s vowel production under normal circumstances based on physiological data from the literature, and the accuracy of the vowel simulation could be higher when the vocal tract is divided into more sections in our model.
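The low-pass character of a Rosenberg-type glottal signal noted above can be illustrated with a short sketch. It uses the commonly cited trigonometric form of the Rosenberg pulse (raised-cosine opening phase, quarter-cosine closing phase). Whether the thesis used this variant is not stated in the abstract, and the opening and closing fractions, the sampling rate, and the fundamental frequency below are assumed values chosen for illustration.

```python
import cmath
import math


def rosenberg_pulse(t, period, open_frac=0.4, close_frac=0.16):
    """Glottal flow at time t for a trigonometric Rosenberg-type pulse.

    open_frac and close_frac are the opening and closing phase durations
    as fractions of the pitch period (assumed values).
    """
    tp = open_frac * period
    tn = close_frac * period
    t = t % period
    if t <= tp:                      # opening phase: raised cosine
        return 0.5 * (1.0 - math.cos(math.pi * t / tp))
    if t <= tp + tn:                 # closing phase: quarter cosine
        return math.cos(math.pi * (t - tp) / (2.0 * tn))
    return 0.0                       # closed phase


def harmonic_magnitudes(samples, n_harmonics):
    """DFT magnitudes of one period; index k is the k-th harmonic (0 = DC)."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(samples)))
        for k in range(n_harmonics + 1)
    ]


FS = 16000   # sampling rate in Hz (assumed)
F0 = 100.0   # fundamental frequency in Hz (assumed)
period = 1.0 / F0
samples_per_period = int(round(FS / F0))

pulse = [rosenberg_pulse(i / FS, period) for i in range(samples_per_period)]
mags = harmonic_magnitudes(pulse, 20)

# Level of each harmonic relative to the first, in dB; the values fall off
# toward higher harmonics, which is the low-pass behavior.
rel_db = [20.0 * math.log10(m / mags[1]) for m in mags[1:]]
```

Comparing `rel_db` for this pulse with the same measure computed on the output of a two-mass simulation would be one way to examine the spectral difference between the two sources that the abstract reports.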
Database: Networked Digital Library of Theses & Dissertations