Showing 1 - 10 of 27 for search: '"Keskin, Gokce"'
Author:
Keskin, Gokce, Wu, Minhua, King, Brian, Mallidi, Harish, Gao, Yang, Droppo, Jasha, Rastrow, Ariya, Maas, Roland
Automatic speech recognition (ASR) models are typically designed to operate on a single input data type, e.g. a single or multi-channel audio streamed from a device. This design decision assumes the primary input data source does not change and if an
External link:
http://arxiv.org/abs/2106.02750
Author:
Pulugundla, Bhargav, Gao, Yang, King, Brian, Keskin, Gokce, Mallidi, Harish, Wu, Minhua, Droppo, Jasha, Maas, Roland
Attention-based beamformers have recently been shown to be effective for multi-channel speech recognition. However, they are less capable at capturing local information. In this work, we propose a 2D Conv-Attention module which combines convolution n
External link:
http://arxiv.org/abs/2105.05920
Author:
Hu, Hu, Yang, Xuesong, Raeesy, Zeynab, Guo, Jinxi, Keskin, Gokce, Arsikere, Harish, Rastrow, Ariya, Stolcke, Andreas, Maas, Roland
Accents mismatching is a critical problem for end-to-end ASR. This paper aims to address this problem by building an accent-robust RNN-T system with domain adversarial training (DAT). We unveil the magic behind DAT and provide, for the first time, a
External link:
http://arxiv.org/abs/2012.07353
Published in:
Proc. Interspeech 2019 (2019): 729-733
In this work we introduce a semi-supervised approach to the voice conversion problem, in which speech from a source speaker is converted into speech of a target speaker. The proposed method makes use of both parallel and non-parallel utterances from
External link:
http://arxiv.org/abs/1910.00067
CPU branch prediction has hit a wall--existing techniques achieve near-perfect accuracy on 99% of static branches, and yet the mispredictions that remain hide major performance gains. In a companion report, we show that a primary source of mispredict
External link:
http://arxiv.org/abs/1906.09889
This paper evaluates the effectiveness of a Cycle-GAN based voice converter (VC) on four speaker identification (SID) systems and an automated speech recognition (ASR) system for various purposes. Audio samples converted by the VC model are classifie
External link:
http://arxiv.org/abs/1905.12531
Published in:
ICASSP 2019
We present a rapid design methodology that combines automated hyper-parameter tuning with semi-supervised training to build highly accurate and robust models for voice commands classification. Proposed approach allows quick evaluation of network arch
External link:
http://arxiv.org/abs/1905.04230
Author:
Ocal, Orhan, Elibol, Oguz H., Keskin, Gokce, Stephenson, Cory, Thomas, Anil, Ramchandran, Kannan
We present a method for converting the voices between a set of speakers. Our method is based on training multiple autoencoder paths, where there is a single speaker-independent encoder and multiple speaker-dependent decoders. The autoencoders are tra
External link:
http://arxiv.org/abs/1905.03864
We present a Cycle-GAN based many-to-many voice conversion method that can convert between speakers that are not in the training set. This property is enabled through speaker embeddings generated by a neural network that is jointly trained with the C
External link:
http://arxiv.org/abs/1905.02525
Author:
Keskin, Gokce¹, Kahraman Koytak, Pinar¹, Bastan, Birgul², Tanridag, Tulin¹, Us, Onder¹, Uluc, Kayihan¹ (kayihanu@yahoo.com)
Published in:
Neurological Sciences. Jun 2015, Vol. 36, Issue 6, pp. 883-888. 6 pp. 3 charts.