Showing 1 - 10 of 195 for search: '"Johnson, Michael T"'
Dysarthria is a motor speech disorder often characterized by reduced speech intelligibility through slow, uncoordinated control of speech production muscles. Automatic speech recognition (ASR) systems can help dysarthric talkers communicate more effe
External link:
http://arxiv.org/abs/2308.08438
Published in:
Speech Communication, October 2024, 164
Dysarthria is a motor speech disorder often characterized by reduced speech intelligibility through slow, uncoordinated control of speech production muscles. Automatic speech recognition (ASR) systems may help dysarthric talkers communicate more effe
External link:
http://arxiv.org/abs/2201.11571
As the cornerstone of other important technologies, such as speech recognition and speech synthesis, speech enhancement is a critical area in audio signal processing. In this paper, a new deep learning structure for speech enhancement is demonstrated
External link:
http://arxiv.org/abs/2108.12105
Authors:
Bozorg, Narjes; Johnson, Michael T.
This paper presents Articulatory-WaveNet, a new approach for acoustic-to-articulator inversion. The proposed system uses the WaveNet speech synthesis architecture, with dilated causal convolutional layers using previous values of the predicted articu
External link:
http://arxiv.org/abs/2006.12594
In this paper, we apply a latent class model (LCM) to the task of speaker diarization. LCM is similar to Patrick Kenny's variational Bayes (VB) method in that it uses soft information and avoids premature hard decisions in its iterations. In contrast
External link:
http://arxiv.org/abs/1904.11130
Speaker embeddings achieve promising results on many speaker verification tasks. Phonetic information, as an important component of speech, is rarely considered in the extraction of speaker embeddings. In this paper, we introduce phonetic information
External link:
http://arxiv.org/abs/1804.04862
Frame alignments can be computed by different methods in GMM-based speaker verification. By incorporating a phonetic Gaussian mixture model (PGMM), we are able to compare the performance using alignments extracted from the deep neural networks (DNN)
External link:
http://arxiv.org/abs/1710.10436
Text-dependent speaker verification is becoming popular in the speaker recognition community. However, the conventional i-vector framework, which has been successful for speaker identification and other similar tasks, works relatively poorly in this task
External link:
http://arxiv.org/abs/1707.04373
Published in:
Journal of Manufacturing Processes, December 2019, 48:210-217