Adaptation Algorithms for Neural Network-Based Speech Recognition: An Overview

Author: Jinyu Li, Peter Bell, Steve Renals, Pawel Swietojanski, Ondrej Klejch, Joachim Fainberg
Language: English
Year of publication: 2021
Subject:
FOS: Computer and information sciences
semi-supervised learning
Domain adaptation
Sound (cs.SD)
Computer science
domain adaptation
Speech recognition
02 engineering and technology
accent adaptation
Computer Science - Sound
030507 speech-language pathology & audiology
03 medical and health sciences
speaker embeddings
Approximation error
Audio and Speech Processing (eess.AS)
Stress (linguistics)
structured linear transforms
0202 electrical engineering, electronic engineering, information engineering
FOS: Electrical engineering, electronic engineering, information engineering
Adaptation (computer science)
Hidden Markov model
Computer Science - Computation and Language
Artificial neural network
speech recognition
speaker adaptation
TK1-9971
Accent adaptation
regularization
Model parameter
Computer Science::Sound
Signal Processing
020201 artificial intelligence & image processing
Electrical engineering. Electronics. Nuclear engineering
0305 other medical science
Focus (optics)
Algorithm
Computation and Language (cs.CL)
Electrical Engineering and Systems Science - Audio and Speech Processing
data augmentation
Source: IEEE Open Journal of Signal Processing, Vol 2, Pp 33-66 (2021)
IEEE Open Journal of Signal Processing
Bell, P, Fainberg, J, Klejch, O, Li, J, Renals, S & Swietojanski, P 2021, 'Adaptation Algorithms for Neural Network-Based Speech Recognition: An Overview', IEEE Open Journal of Signal Processing, vol. 2, pp. 33-66. https://doi.org/10.1109/OJSP.2020.3045349
ISSN: 2644-1322
Description: We present a structured overview of adaptation algorithms for neural network-based speech recognition, considering both hybrid hidden Markov model / neural network systems and end-to-end neural network systems, with a focus on speaker adaptation, domain adaptation, and accent adaptation. The overview characterizes adaptation algorithms as based on embeddings, model parameter adaptation, or data augmentation. We present a meta-analysis of the performance of speech recognition adaptation algorithms, based on relative error rate reductions as reported in the literature.
Comment: Total of 31 pages, 27 figures. Associated repository: https://github.com/pswietojanski/ojsp_adaptation_review_2020
Database: OpenAIRE