Showing 1 - 10 of 186 for search: '"Stylianou, Yannis"'
Author:
Raitio, Tuomo, Petkov, Petko, Li, Jiangchuan, Shifas, Muhammed, Davis, Andrea, Stylianou, Yannis
We present a neural text-to-speech (TTS) method that models natural vocal effort variation to improve the intelligibility of synthetic speech in the presence of noise. The method consists of first measuring the spectral tilt of unlabeled conventional…
External link:
http://arxiv.org/abs/2203.10637
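The abstract above mentions measuring spectral tilt. As background, a common way to estimate spectral tilt is to fit a line to the log-magnitude spectrum of a frame and take the slope; the sketch below illustrates that generic approach and is not necessarily the exact measurement used in the paper.

```python
import numpy as np

def spectral_tilt(frame, sr):
    """Estimate the spectral tilt of one speech frame.

    Generic illustration: fit a straight line to the log-magnitude
    spectrum (dB) over frequency; the slope is the tilt. Typical
    voiced speech has a downward (negative) tilt.
    """
    windowed = frame * np.hanning(len(frame))
    mag = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sr)
    # Skip the DC bin; work in dB so the slope is a tilt in dB/Hz.
    log_mag = 20.0 * np.log10(mag[1:] + 1e-12)
    slope, _intercept = np.polyfit(freqs[1:], log_mag, deg=1)
    return slope
```

Raising vocal effort flattens the tilt (slope moves toward zero), which is why tilt is a useful proxy for effort in noise-adaptive TTS.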
In this work, we explore multiple architectures and training procedures for developing a multi-speaker and multi-lingual neural TTS system, with the goals of a) improving the quality when the available data in the target language is limited and b) ena…
External link:
http://arxiv.org/abs/2108.07737
In this work we evaluate a neural-based speech intelligibility booster based on spectral shaping and dynamic range compression (SSDRC), referred to as WaveNet-based SSDRC (wSSDRC), using a recently designed Greek Harvard-style corpus. The corpus has…
External link:
http://arxiv.org/abs/2011.06548
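SSDRC combines spectral shaping with dynamic range compression (DRC). As a minimal sketch of the DRC half only: track a smoothed amplitude envelope with fast attack and slow release, then rescale samples so loud regions are attenuated relative to quiet ones. All parameter values here are illustrative, not those of the paper.

```python
import numpy as np

def dynamic_range_compress(x, sr, attack_ms=2.0, release_ms=20.0, exponent=0.5):
    """Toy dynamic range compressor (the 'DRC' part of SSDRC-style processing).

    Tracks an amplitude envelope (fast attack, slow release), then applies
    a per-sample gain so the output envelope is env**exponent; with
    exponent < 1, loud regions are compressed relative to quiet ones.
    """
    attack = np.exp(-1.0 / (sr * attack_ms / 1000.0))
    release = np.exp(-1.0 / (sr * release_ms / 1000.0))
    env = np.zeros_like(x)
    level = 1e-6
    for i, sample in enumerate(np.abs(x)):
        coeff = attack if sample > level else release
        level = coeff * level + (1.0 - coeff) * sample
        env[i] = level
    gain = (env + 1e-9) ** (exponent - 1.0)  # = env**exponent / env
    y = x * gain
    return y / (np.max(np.abs(y)) + 1e-9)  # normalize peak to 1
```

Compressing the dynamic range raises the level of low-energy (often consonant) regions relative to vowels, which is one reason such processing improves intelligibility in noise.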
The increased adoption of digital assistants makes text-to-speech (TTS) synthesis systems an indispensable feature of modern mobile devices. It is hence desirable to build a system capable of generating highly intelligible speech in the presence of n…
External link:
http://arxiv.org/abs/2008.05809
Recent advancements in deep learning have led to human-level performance in single-speaker speech synthesis. However, there are still limitations in terms of speech quality when generalizing those systems into multi-speaker models, especially for unseen…
External link:
http://arxiv.org/abs/2008.05289
Author:
Abdelaziz, Ahmed Hussen, Kumar, Anushree Prasanna, Seivwright, Chloe, Fanelli, Gabriele, Binder, Justin, Stylianou, Yannis, Kajarekar, Sachin
Audiovisual speech synthesis is the problem of synthesizing a talking face while maximizing the coherency of the acoustic and visual speech. In this paper, we propose and compare two audiovisual speech synthesis systems for 3D face models. The first…
External link:
http://arxiv.org/abs/2008.00620
Author:
Pantazis, Yannis, Paul, Dipjyoti, Fasoulakis, Michail, Stylianou, Yannis, Katsoulakis, Markos
In this paper, we propose a novel loss function for training Generative Adversarial Networks (GANs), aiming at a deeper theoretical understanding as well as improved stability and performance of the underlying optimization problem. The new loss fu…
External link:
http://arxiv.org/abs/2006.06625
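For context on what the proposed loss replaces, the standard (non-saturating) GAN losses can be written in a few lines; the paper's novel loss function itself is not reproduced here.

```python
import numpy as np

def gan_losses(d_real, d_fake):
    """Standard (non-saturating) GAN losses, shown for background only.

    d_real / d_fake are discriminator sigmoid outputs in (0, 1) for real
    and generated samples. The discriminator maximizes the log-likelihood
    of correct classification; the generator maximizes log D(G(z)).
    """
    eps = 1e-12  # guard against log(0)
    d_loss = -np.mean(np.log(d_real + eps) + np.log(1.0 - d_fake + eps))
    g_loss = -np.mean(np.log(d_fake + eps))  # non-saturating generator loss
    return d_loss, g_loss
```

At the point where the discriminator is maximally confused (all outputs 0.5), the discriminator loss equals 2 log 2, a standard reference value when monitoring GAN training.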
Convolutional neural network (CNN) modules are widely used to build high-end speech enhancement neural models. However, the feature extraction power of vanilla CNN modules has been limited by the dimensionality constraint of the convolution ker…
External link:
http://arxiv.org/abs/2006.05233
In this paper, we propose a new parallel, non-causal and shallow waveform-domain architecture for speech enhancement based on FFTNet, a neural network for generating high-quality audio waveforms. In contrast to other waveform-based approaches such as Wav…
External link:
http://arxiv.org/abs/2006.04469