The BioVisualSpeech corpus of words with sibilants for speech therapy games development

Authors: Nuno Marques, Maxine Eskenazi, João Magalhães, Sofia Martins, Sofia Cavaco, Mariana Ascensão, Ivo Anjos, Margarida Grilo, Isabel Guimarães, Francisco Roque de Oliveira, Alberto Abad
Language: English
Publication year: 2020
Subjects:
Speech production
Computer science
First language
02 engineering and technology
speech sound disorders
Sibilant consonants
computer.software_genre
030507 speech-language pathology & audiology
03 medical and health sciences
European Portuguese
Children’s speech corpus
0202 electrical engineering, electronic engineering, information engineering
lcsh:T58.5-58.64
business.industry
Sibilant
lcsh:Information technology
Speech processing
language.human_language
Serious games for speech and language therapy
Word recognition
language
020201 artificial intelligence & image processing
Artificial intelligence
Portuguese
0305 other medical science
business
computer
Classifier (UML)
Natural language processing
Information Systems
Source: Information
Volume 11
Issue 10
Information, Vol 11, Iss 470, p 470 (2020)
Description: In order to develop computer tools for speech therapy that reliably classify speech productions, there is a need for speech production corpora that characterize the target population in terms of age, gender, and native language. Apart from including correct speech productions, in order to characterize the target population, the corpora should also include samples from people with speech sound disorders. In addition, the annotation of the data should include information on the correctness of the speech productions. Following these criteria, we collected a corpus that can be used to develop computer tools for speech and language therapy of Portuguese children with sigmatism. The proposed corpus contains European Portuguese children’s word productions in which the words have sibilant consonants. The corpus has productions from 356 children from 5 to 9 years of age. Some important characteristics of this corpus, which are relevant to speech and language therapy and computer science research, are that (1) the corpus includes data from children with speech sound disorders; and (2) the productions were annotated according to the criteria of speech and language pathologists, and have information about the speech production errors. These are relevant features for the development and assessment of speech processing tools for speech therapy of Portuguese children. In addition, as an illustration of how to use the corpus, we present three speech therapy games that use a convolutional neural network sibilants classifier trained with data from this corpus and a word recognition module trained on additional children’s data and calibrated and evaluated with the collected corpus.
Database: OpenAIRE