Search results

Report

Comparing CTC and LFMMI for out-of-domain adaptation of wav2vec 2.0 acoustic model

Authors: Vyas, Apoorv; Madikeri, Srikanth; Bourlard, Hervé

In this work, we investigate if the wav2vec 2.0 self-supervised pretraining helps mitigate the overfitting issues with connectionist temporal classification (CTC) training to reduce its performance gap with flat-start lattice-free MMI (E2E-LFMMI) for

External link: http://arxiv.org/abs/2104.02558

Report

Lattice-Free MMI Adaptation Of Self-Supervised Pretrained Acoustic Models

Authors: Vyas, Apoorv; Madikeri, Srikanth; Bourlard, Hervé

In this work, we propose lattice-free MMI (LFMMI) for supervised adaptation of a self-supervised pretrained acoustic model. We pretrain a Transformer model on a thousand hours of untranscribed Librispeech data followed by supervised adaptation with LFMMI

External link: http://arxiv.org/abs/2012.14252

Report

Automatic dysarthric speech detection exploiting pairwise distance-based convolutional neural networks

Authors: Janbakhshi, P.; Kodrasi, I.; Bourlard, H.

Automatic dysarthric speech detection can provide reliable and cost-effective computer-aided tools to assist the clinical diagnosis and management of dysarthria. In this paper we propose a novel automatic dysarthric speech detection approach based on

External link: http://arxiv.org/abs/2011.07545

Report

Automatic and perceptual discrimination between dysarthria, apraxia of speech, and neurotypical speech

Authors: Kodrasi, I.; Pernon, M.; Laganaro, M.; Bourlard, H.

Automatic techniques in the context of motor speech disorders (MSDs) are typically two-class techniques aiming to discriminate between dysarthria and neurotypical speech or between dysarthria and apraxia of speech (AoS). Further, although such techni

External link: http://arxiv.org/abs/2011.07542

Report

Pkwrap: a PyTorch Package for LF-MMI Training of Acoustic Models

Authors: Madikeri, Srikanth; Tong, Sibo; Zuluaga-Gomez, Juan; Vyas, Apoorv; Motlicek, Petr; Bourlard, Hervé

We present a simple wrapper that is useful to train acoustic models in PyTorch using Kaldi's LF-MMI training framework. The wrapper, called pkwrap (short form of PyTorch kaldi wrapper), enables the user to utilize the flexibility provided by PyTorch

External link: http://arxiv.org/abs/2010.03466

Report

Neural Network based End-to-End Query by Example Spoken Term Detection

Authors: Ram, Dhananjay; Miculicich, Lesly; Bourlard, Hervé

This paper focuses on the problem of query by example spoken term detection (QbE-STD) in a zero-resource scenario. State-of-the-art approaches primarily rely on dynamic time warping (DTW) based template matching techniques using phone posterior or bott

External link: http://arxiv.org/abs/1911.08332

Report

Multilingual Bottleneck Features for Query by Example Spoken Term Detection

Authors: Ram, Dhananjay; Miculicich, Lesly; Bourlard, Hervé

State-of-the-art solutions to query by example spoken term detection (QbE-STD) usually rely on bottleneck feature representation of the query and audio document to perform dynamic time warping (DTW) based template matching. Here, we present a study o

External link: http://arxiv.org/abs/1907.00443

Report

Multilingual Training and Cross-lingual Adaptation on CTC-based Acoustic Model

Authors: Tong, Sibo; Garner, Philip N.; Bourlard, Hervé

Multilingual models for Automatic Speech Recognition (ASR) are attractive as they have been shown to benefit from more training data, and better lend themselves to adaptation to under-resourced languages. However, initialisation from monolingual cont

External link: http://arxiv.org/abs/1711.10025

Report

Information Theoretic Analysis of DNN-HMM Acoustic Modeling

Authors: Dighe, Pranay; Asaei, Afsaneh; Bourlard, Hervé

We propose an information theoretic framework for quantitative assessment of acoustic modeling for hidden Markov model (HMM) based automatic speech recognition (ASR). Acoustic modeling yields the probabilities of HMM sub-word states for a short tempo

External link: http://arxiv.org/abs/1709.01144

Report

Low-rank and Sparse Soft Targets to Learn Better DNN Acoustic Models

Authors: Dighe, Pranay; Asaei, Afsaneh; Bourlard, Herve

Conventional deep neural networks (DNN) for speech acoustic modeling rely on Gaussian mixture models (GMM) and hidden Markov models (HMM) to obtain binary class labels as the targets for DNN training. Subword classes in speech recognition systems corr

External link: http://arxiv.org/abs/1610.05688
