Eyra - Speech Data Acquisition System for Many Languages
Authors: | Simon Klüpfel, Matthías Pétursson, Jon Gudnason |
---|---|
Year of publication: | 2016 |
Subject: | Audio mining; Computer science; speech resource collection; Interface (computing); automatic speech recognition; ComputerApplications_COMPUTERSINOTHERSYSTEMS; 020206 networking & telecommunications; 02 engineering and technology; under-resourced languages; Data acquisition; Human–computer interaction; 0202 electrical engineering, electronic engineering, information engineering; Internet access; Operating system; General Earth and Planetary Sciences; 020201 artificial intelligence & image processing; Speech analytics; internationalization; General Environmental Science |
Source: | SLTU |
ISSN: | 1877-0509 |
DOI: | 10.1016/j.procs.2016.04.029 |
Description: | Speech data acquisition is particularly important for under-resourced languages. Data gathering is the most labour-intensive part of developing speech technologies such as automatic speech recognizers and synthesizers. It is therefore important to facilitate this process with as much automation and as many labour-saving tools as possible. This paper describes a new open-source system called Eyra, which enables distributed speech data collection through a variety of devices. It addresses internet connectivity issues by allowing data collectors to run the back-end server off a local laptop, thereby facilitating automatic quality control and making data uploading and compiling less labour-intensive. It can also be used in a crowd-sourcing set-up where volunteers donate voice samples through a desktop web-browser interface. An initial test shows that the system works well in offline mode using smartphones for data collection. |
Database: | OpenAIRE |
External link: |