Showing 1 - 10 of 11 for search: '"Motoi Omachi"'
Published in:
ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
Published in:
Interspeech 2021.
Neural end-to-end (E2E) models have become a promising technique to realize practical automatic speech recognition (ASR) systems. When realizing such a system, one important issue is the segmentation of audio to deal with streaming input or long reco
Authors:
Xuankai Chang, Motoi Omachi, Aswin Shanmugam Subramanian, Shinji Watanabe, Yuya Fujita, Pengcheng Guo
Published in:
INTERSPEECH
Published in:
INTERSPEECH
End-to-end (E2E) models have gained attention in the research field of automatic speech recognition (ASR). Many E2E models proposed so far assume left-to-right autoregressive generation of an output token sequence except for connectionist temporal cl
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::8cf152c1d9616b3ec8695a578fe39351
http://arxiv.org/abs/2005.13211
Published in:
IEEE/ACM Transactions on Audio, Speech, and Language Processing. 25:637-650
We propose a blind source separation method that yields high-quality speech with low distortion. Time-frequency (TF) masking can effectively reduce interference, but it produces nonlinear distortion. By contrast, linear filtering using a separation m
Published in:
ICASSP
End-to-end (E2E) automatic speech recognition (ASR) with sequence-to-sequence models has gained attention because of its simple model training compared with conventional hidden Markov model based ASR. Recently, several studies report the state-of-the
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::f42f277e4c7a13f3e27d78b8e032e6f3
http://arxiv.org/abs/1912.11793
Published in:
SLT
This paper addresses the problem of automatic speech recognition (ASR) of a target speaker in background speech. The novelty of our approach is that we focus on a wakeup keyword, which is usually used for activating ASR systems like smart speakers. T
Published in:
ICASSP
Simply feeding the last hidden layer of a deep neural network (DNN) back to the input layer was recently found to be effective for noise-robust acoustic modeling. Such a high-level feature strengthens the robustness of the DNN-based acoustic model while pay
Published in:
EUSIPCO
A source signal is estimated using an associative memory model (AMM) and used for separation matrix optimization in linear blind source separation (BSS) to yield high quality and less distorted speech. Linear-filtering-based BSS, such as independent
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::7615a7aa122672c9f033cac8b3bb5677
Published in:
INTERSPEECH