Showing 1 - 10 of 245 for search: '"Kinoshita, Keisuke"'
Author:
Zmolikova, Katerina, Delcroix, Marc, Ochiai, Tsubasa, Kinoshita, Keisuke, Černocký, Jan, Yu, Dong
Humans can listen to a target speaker even in challenging acoustic conditions that have noise, reverberation, and interfering speakers. This phenomenon is known as the cocktail-party effect. For decades, researchers have focused on approaching the li
External link:
http://arxiv.org/abs/2301.13341
Author:
von Neumann, Thilo, Boeddeker, Christoph, Kinoshita, Keisuke, Delcroix, Marc, Haeb-Umbach, Reinhold
We propose a general framework to compute the word error rate (WER) of ASR systems that process recordings containing multiple speakers at their input and that produce multiple output word sequences (MIMO). Such ASR systems are typically required, e.
External link:
http://arxiv.org/abs/2211.16112
Author:
Kinoshita, Keisuke, von Neumann, Thilo, Delcroix, Marc, Boeddeker, Christoph, Haeb-Umbach, Reinhold
Recent speaker diarization studies showed that integration of end-to-end neural diarization (EEND) and clustering-based diarization is a promising approach for achieving state-of-the-art performance on various tasks. Such an approach first divides an
External link:
http://arxiv.org/abs/2207.13888
Author:
Sato, Hiroshi, Ochiai, Tsubasa, Delcroix, Marc, Kinoshita, Keisuke, Moriya, Takafumi, Makishima, Naoki, Ihori, Mana, Tanaka, Tomohiro, Masumura, Ryo
Target speech extraction is a technique to extract the target speaker's voice from mixture signals using a pre-recorded enrollment utterance that characterizes the voice of the target speaker. One major difficulty of target speech extr
External link:
http://arxiv.org/abs/2206.08174
Author:
Delcroix, Marc, Kinoshita, Keisuke, Ochiai, Tsubasa, Zmolikova, Katerina, Sato, Hiroshi, Nakatani, Tomohiro
Target speech extraction (TSE) extracts the speech of a target speaker in a mixture given auxiliary clues characterizing the speaker, such as an enrollment utterance. TSE thus addresses the challenging problem of simultaneously performing separation
External link:
http://arxiv.org/abs/2204.04811
Author:
Delcroix, Marc, Vázquez, Jorge Bennasar, Ochiai, Tsubasa, Kinoshita, Keisuke, Ohishi, Yasunori, Araki, Shoko
In many situations, we would like to hear desired sound events (SEs) while being able to ignore interference. Target sound extraction (TSE) tackles this problem by estimating the audio signal of the sounds of target SE classes in a mixture of sounds
External link:
http://arxiv.org/abs/2204.03895
Author:
Yamamoto, Ayako, Irino, Toshio, Araki, Shoko, Arai, Kenichi, Ogawa, Atsunori, Kinoshita, Keisuke, Nakatani, Tomohiro
Published in:
Proc. APSIPA ASC 2022
It is essential to perform speech intelligibility (SI) experiments with human listeners in order to evaluate objective intelligibility measures for developing effective speech enhancement and noise reduction algorithms. Recently, crowdsourced remote
External link:
http://arxiv.org/abs/2203.16760
Speaker diarization has been investigated extensively as an important central task for meeting analysis. A recent trend shows that integrating end-to-end neural diarization (EEND) and clustering-based diarization is a promising approach to handle realistic conv
External link:
http://arxiv.org/abs/2202.06524
Author:
Sato, Hiroshi, Ochiai, Tsubasa, Delcroix, Marc, Kinoshita, Keisuke, Kamo, Naoyuki, Moriya, Takafumi
Published in:
In 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6287-6291
The combination of a deep neural network (DNN)-based speech enhancement (SE) front-end and an automatic speech recognition (ASR) back-end is a widely used approach to implement overlapping speech recognition. However, the SE front-end generates proc
External link:
http://arxiv.org/abs/2201.03881
Author:
Nakatani, Tomohiro, Ikeshita, Rintaro, Kinoshita, Keisuke, Sawada, Hiroshi, Kamo, Naoyuki, Araki, Shoko
This paper develops a framework that can perform denoising, dereverberation, and source separation accurately by using a relatively small number of microphones. It has been empirically confirmed that Independent Vector Analysis (IVA) can blindly sepa
External link:
http://arxiv.org/abs/2111.10574