Parallel speech collection for under-resourced language studies using the LIG-AIKUMA mobile device app

Authors: Laurent Besacier, David Blachon, Guy-Noël Kouarata, Martine Adda-Decker, Annie Rialland, Elodie Gauthier
Contributors: Laboratoire d'Informatique de Grenoble (LIG), Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP)-Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes [2016-2019] (UGA [2016-2019]), Groupe d’Étude en Traduction Automatique/Traitement Automatisé des Langues et de la Parole (GETALP), Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP)-Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes [2016-2019] (UGA [2016-2019]), Institut Universitaire de France (IUF), Ministère de l'Education nationale, de l’Enseignement supérieur et de la Recherche (M.E.N.E.S.R.), Dynamique Du Langage (DDL), Université Lumière - Lyon 2 (UL2)-Centre National de la Recherche Scientifique (CNRS), LPP - Laboratoire de Phonétique et Phonologie - UMR 7018 (LPP), Université Sorbonne Nouvelle - Paris 3-Centre National de la Recherche Scientifique (CNRS), ANR-13-BS02-0009, ALFFA, Traitement Automatique de la Parole pour les Langues Africaines (2013), ANR-14-CE35-0002, BULB, Breaking the Unwritten Language Barrier (2014)
Language: English
Publication year: 2016
Subject:
Source: SLTU-2016 5th Workshop on Spoken Language Technologies for Under-resourced Languages
Procedia computer science
Workshop on Spoken Language Technologies for Under-resourced Languages (SLTU), May 2016, Yogyakarta, Indonesia. ⟨10.1016/j.procs.2016.04.030⟩
ISSN: 1877-0509
DOI: 10.1016/j.procs.2016.04.030
Description: This paper reports on our ongoing efforts to collect speech data in under-resourced or endangered languages of Africa. Data collection is carried out using an improved version of the Android application AIKUMA, developed by Steven Bird and colleagues (1). Features were added to the app to facilitate the collection of parallel speech data, in line with the requirements of the French-German ANR/DFG BULB (Breaking the Unwritten Language Barrier) project. The resulting app, called LIG-AIKUMA, runs on various mobile phones and tablets and offers four speech collection modes: recording, respeaking, translation and elicitation. LIG-AIKUMA's improved features include smart generation and handling of speaker metadata, as well as respeaking and parallel audio data mapping. The app was used for field data collection in Congo-Brazzaville, resulting in a total of over 80 hours of speech. This paper further describes design issues of the mobile app and the use of LIG-AIKUMA during two recording campaigns. © 2016 Published by Elsevier B.V.
Database: OpenAIRE