Parallel speech collection for under-resourced language studies using the LIG-AIKUMA mobile device app
Authors: | Laurent Besacier, David Blachon, Guy-Noël Kouarata, Martine Adda-Decker, Annie Rialland, Elodie Gauthier |
---|---|
Contributors: | Laboratoire d'Informatique de Grenoble (LIG), Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP)-Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes [2016-2019] (UGA [2016-2019]), Groupe d’Étude en Traduction Automatique/Traitement Automatisé des Langues et de la Parole (GETALP), Institut Universitaire de France (IUF), Ministère de l'Education nationale, de l’Enseignement supérieur et de la Recherche (M.E.N.E.S.R.), Dynamique Du Langage (DDL), Université Lumière - Lyon 2 (UL2)-Centre National de la Recherche Scientifique (CNRS), LPP - Laboratoire de Phonétique et Phonologie - UMR 7018 (LPP), Université Sorbonne Nouvelle - Paris 3-Centre National de la Recherche Scientifique (CNRS), ANR-13-BS02-0009, ALFFA, Traitement Automatique de la Parole pour les Langues Africaines (2013), ANR-14-CE35-0002, BULB, Breaking the Unwritten Language Barrier (2014) |
Language: | English |
Publication year: | 2016 |
Subject: | Computer science; Language barrier; 02 engineering and technology; Language documentation; computer.software_genre; under-resourced languages documentation; [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]; Speech collection tool; 030507 speech-language pathology & audiology; 03 medical and health sciences; 0202 electrical engineering, electronic engineering, information engineering; General Environmental Science; language; Multimedia; business.industry; Languages of Africa; Metadata; General Earth and Planetary Sciences; 020201 artificial intelligence & image processing; Artificial intelligence; Line (text file); 0305 other medical science; business; computer; Mobile device; Natural language processing |
Source: | SLTU-2016, 5th Workshop on Spoken Language Technologies for Under-resourced Languages (SLTU), May 2016, Yogyakarta, Indonesia. Procedia Computer Science. ⟨10.1016/j.procs.2016.04.030⟩ |
ISSN: | 1877-0509 |
DOI: | 10.1016/j.procs.2016.04.030 |
Description: | This paper reports on our ongoing efforts to collect speech data in under-resourced or endangered languages of Africa. Data collection is carried out using an improved version of the Android application AIKUMA, developed by Steven Bird and colleagues (1). Features were added to the app to facilitate the collection of parallel speech data, in line with the requirements of the French-German ANR/DFG BULB (Breaking the Unwritten Language Barrier) project. The resulting app, called LIG-AIKUMA, runs on various mobile phones and tablets and offers a range of speech collection modes (recording, respeaking, translation and elicitation). LIG-AIKUMA's improved features include smart generation and handling of speaker metadata as well as respeaking and parallel audio data mapping. It was used for field data collection in Congo-Brazzaville, resulting in a total of over 80 hours of speech. Design issues of the mobile app, as well as the use of LIG-AIKUMA during two recording campaigns, are further described in this paper. (C) 2016 Published by Elsevier B.V. |
Database: | OpenAIRE |