Voice Restoration After Laryngectomy Based on Magnetic Sensing of Articulator Movement and Statistical Articulation-to-Speech Conversion
Author: | José A. González, James M. Gilbert, Jie Bai, Phil D. Green, Lam Aun Cheah, Stephen R. Ell, Roger K. Moore |
---|---|
Year of publication: | 2017 |
Subject: |
Computer science; Movement (music); Speech recognition; Articulator; Speech synthesis; Statistical model; Speech processing; Voice analysis; Laryngectomy; Articulation (phonetics); Natural sciences; Physical sciences; Acoustics; Medical and health sciences; Other medical science; Speech-language pathology & audiology |
Source: | Biomedical Engineering Systems and Technologies; ISBN: 9783319547169; BIOSTEC (Selected Papers) |
DOI: | 10.1007/978-3-319-54717-6_17 |
Description: | In this work, we present a silent speech system that is able to generate audible speech from the captured movement of the speech articulators. Our goal is to help laryngectomy patients, i.e. patients who have lost the ability to speak following surgical removal of the larynx (most frequently due to cancer), to recover their voice. In our system, we use a magnetic sensing technique known as Permanent Magnet Articulography (PMA) to capture the movement of the lips and tongue by attaching small magnets to the articulators and monitoring the magnetic field changes with sensors close to the mouth. The captured sensor data is then transformed into a sequence of speech parameter vectors, from which a time-domain speech signal is finally synthesised. The key component of our system is a parametric transformation which represents the PMA-to-speech mapping. Here, this transformation takes the form of a statistical model (more specifically, a mixture of factor analysers) whose parameters are learned from simultaneous recordings of PMA and speech signals acquired before laryngectomy. To evaluate the performance of our system on voice reconstruction, we recorded two PMA-and-speech databases with different phonetic complexity for several non-impaired subjects. Results show that our system is able to synthesise speech that sounds like the original voice of the subject and is also intelligible. However, more work still needs to be done to achieve consistent synthesis for phonetically rich vocabularies. |
Database: | OpenAIRE |
External link: |
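
The PMA-to-speech mapping summarised in the description can be illustrated with a small sketch. The record gives no implementation details, so everything below is an assumption made for illustration: each mixture component has a single latent factor and diagonal noise, the conversion is a frame-by-frame conditional mean of the speech parameters given the PMA frame, and the names `component_stats`, `convert_frame` and the dictionary keys (`weight`, `mu_x`, `w_x`, `psi_x`, `mu_y`, `w_y`) are hypothetical. It is not the authors' code.

```python
import math


def component_stats(x, comp):
    """Return (log p(x | k), E[v | x, k]) for one single-factor component.

    Assumed model for component k: x = mu_x + w_x * v + noise, v ~ N(0, 1),
    with diagonal noise variances psi_x.
    """
    d = [xi - mi for xi, mi in zip(x, comp["mu_x"])]
    w, psi = comp["w_x"], comp["psi_x"]
    # With one latent factor the Woodbury identity and the matrix
    # determinant lemma reduce to scalar expressions.
    s = 1.0 + sum(wi * wi / pi for wi, pi in zip(w, psi))
    proj = sum(wi * di / pi for wi, di, pi in zip(w, d, psi))
    quad = sum(di * di / pi for di, pi in zip(d, psi)) - proj * proj / s
    log_det = sum(math.log(pi) for pi in psi) + math.log(s)
    log_lik = -0.5 * (len(x) * math.log(2.0 * math.pi) + log_det + quad)
    return log_lik, proj / s


def convert_frame(x, components):
    """Map one PMA frame x to the conditional mean of the speech parameters."""
    stats = [component_stats(x, c) for c in components]
    # Posterior probability of each component given x (log-sum-exp for stability).
    log_post = [math.log(c["weight"]) + ll for c, (ll, _) in zip(components, stats)]
    top = max(log_post)
    post = [math.exp(lp - top) for lp in log_post]
    total = sum(post)
    post = [p / total for p in post]
    # Posterior-weighted sum of the per-component predictions mu_y + w_y * E[v | x, k].
    dim_y = len(components[0]["mu_y"])
    y = [0.0] * dim_y
    for g, c, (_, v) in zip(post, components, stats):
        for j in range(dim_y):
            y[j] += g * (c["mu_y"][j] + c["w_y"][j] * v)
    return y
```

In such a sketch the component parameters would be estimated beforehand from the simultaneous PMA and speech recordings, and `convert_frame` would be applied to every captured frame before a vocoder turns the predicted parameters into a waveform. Both of those steps are outside what is shown here.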