Everyday Speech in the Indian Subcontinent

Authors:	Pathak, Utkarsh; Gunda, Chandra Sai Krishna; Sathiyamoorthy, Sujitha; Agarwal, Keshav; Murthy, Hema A.
Year of publication:	2024
Subject:	Computer Science - Computation and Language; Computer Science - Sound; Electrical Engineering and Systems Science - Audio and Speech Processing; I.2.7
Document type:	Working Paper
Description:	India has 1369 languages, of which 22 are official. About 13 different scripts are used to represent these languages. A Common Label Set (CLS) was developed based on phonetics to address the large vocabulary of units required in the End-to-End (E2E) framework for multilingual synthesis. This reduced the footprint of the synthesizer and also enabled fast adaptation to new languages with similar phonotactics, provided the language scripts belonged to the same family. In this paper, we provide new insights into speech synthesis where the script belongs to one family while the phonotactics come from another. Indian language text is first converted to CLS, and then a synthesizer that matches the phonotactics of the language is used. Quality akin to that of a native speaker is obtained for Sanskrit and Konkani with zero adaptation data, using Kannada and Marathi synthesizers respectively. Further, this approach also lends itself to seamless code switching across 13 Indian languages and English in a given native speaker's voice. Comment: 5 pages, 1 figure, submitted to ICASSP 2025
Database:	arXiv
External link:	http://arxiv.org/abs/2410.10508