M2ASR-KIRGHIZ: A Free Kirghiz Speech Database and Accompanied Baselines

Authors: Ikram Mamtimin, Wenqiang Du, Askar Hamdulla
Language: English
Year of publication: 2023
Subject:
Source: Information, Vol 14, Iss 1, p 55 (2023)
Document type: article
ISSN: 2078-2489
DOI: 10.3390/info14010055
Description: Deep learning has significantly improved the performance of automatic speech recognition (ASR), aided by large amounts of data. For minority languages, however, almost no large-scale data resources exist, which limits the development of ASR technologies for these languages. In this paper, we publish a free Kirghiz speech database together with associated language resources. The database comprises 128 h of speech data from 163 speakers, with corresponding transcriptions. To our knowledge, this is the largest freely available Kirghiz speech database dedicated to the ASR task so far. We also provide several baseline systems based on Kaldi and WeNet to demonstrate how these public data resources can be used to facilitate Kirghiz ASR research. This publication is part of the M2ASR project, and all resources can be downloaded from the project webpage.
Database: Directory of Open Access Journals