Showing 1 - 10 of 19 results for the search: '"Huybrechts, Goeric"'
Author:
Peri, Raghuveer, Jayanthi, Sai Muralidhar, Ronanki, Srikanth, Bhatia, Anshu, Mundnich, Karel, Dingliwal, Saket, Das, Nilaksh, Hou, Zejiang, Huybrechts, Goeric, Vishnubhotla, Srikanth, Garcia-Romero, Daniel, Srinivasan, Sundararajan, Han, Kyu J, Kirchhoff, Katrin
Integrated Speech and Large Language Models (SLMs) that can follow speech instructions and generate relevant text responses have gained popularity lately. However, the safety and robustness of these models remain largely unclear. In this work, we…
External link:
http://arxiv.org/abs/2405.08317
Author:
Huybrechts, Goeric, Ronanki, Srikanth, Li, Xilai, Nosrati, Hadis, Bodapati, Sravan, Kirchhoff, Katrin
Conformer-based end-to-end models have become ubiquitous these days and are commonly used in both streaming and non-streaming automatic speech recognition (ASR). Techniques like dual-mode and dynamic chunk training helped unify streaming and non-streaming…
External link:
http://arxiv.org/abs/2306.08175
Recently, there has been an increasing interest in unifying streaming and non-streaming speech recognition models to reduce development, training and deployment cost. The best-known approaches rely on either window-based or dynamic chunk-based attention…
External link:
http://arxiv.org/abs/2304.09325
The availability of data in expressive styles across languages is limited, and recording sessions are costly and time consuming. To overcome these issues, we demonstrate how to build low-resource, neural text-to-speech (TTS) voices with only 1 hour of…
External link:
http://arxiv.org/abs/2207.14607
Author:
Gabryś, Adam, Huybrechts, Goeric, Ribeiro, Manuel Sam, Chien, Chung-Ming, Roth, Julian, Comini, Giulia, Barra-Chicote, Roberto, Perz, Bartek, Lorenzo-Trueba, Jaime
State-of-the-art text-to-speech (TTS) systems require several hours of recorded speech data to generate high-quality synthetic speech. When using reduced amounts of training data, standard TTS models suffer from speech quality and intelligibility degradation…
External link:
http://arxiv.org/abs/2202.08164
Author:
Ribeiro, Manuel Sam, Roth, Julian, Comini, Giulia, Huybrechts, Goeric, Gabrys, Adam, Lorenzo-Trueba, Jaime
We address the problem of cross-speaker style transfer for text-to-speech (TTS) using data augmentation via voice conversion. We assume to have a corpus of neutral non-expressive data from a target speaker and supporting conversational expressive data…
External link:
http://arxiv.org/abs/2202.05083
Author:
Shah, Raahil, Pokora, Kamil, Ezzerg, Abdelhamid, Klimkov, Viacheslav, Huybrechts, Goeric, Putrycz, Bartosz, Korzekwa, Daniel, Merritt, Thomas
Whilst recent neural text-to-speech (TTS) approaches produce high-quality speech, they typically require a large amount of recordings from the target speaker. In previous work, a 3-step method was proposed to generate high-quality TTS while greatly reducing…
External link:
http://arxiv.org/abs/2106.12896
Emotional voice conversion models adapt the emotion in speech without changing the speaker identity or linguistic content. They are less data hungry than text-to-speech models and make it possible to generate large amounts of emotional data for downstream tasks…
External link:
http://arxiv.org/abs/2101.05695
Author:
Huybrechts, Goeric, Merritt, Thomas, Comini, Giulia, Perz, Bartek, Shah, Raahil, Lorenzo-Trueba, Jaime
While recent neural text-to-speech (TTS) systems perform remarkably well, they typically require a substantial amount of recordings from the target speaker reading in the desired speaking style. In this work, we present a novel 3-step methodology to…
External link:
http://arxiv.org/abs/2011.05707
We present an approach to synthesize whisper by applying a handcrafted signal processing recipe and Voice Conversion (VC) techniques to convert normally phonated speech to whispered speech. We investigate using Gaussian Mixture Models (GMM) and Deep…
External link:
http://arxiv.org/abs/1912.05289