Showing 1 - 10 of 13 for search: '"Maniati, Georgia"'
Author:
Mitsios, Michael, Vamvoukakis, Georgios, Maniati, Georgia, Ellinas, Nikolaos, Dimitriou, Georgios, Markopoulos, Konstantinos, Kakoulidis, Panos, Vioni, Alexandra, Christidou, Myrsini, Oh, Junkwang, Jho, Gunu, Hwang, Inchul, Vardaxoglou, Georgios, Chalamandaris, Aimilios, Tsiakoulis, Pirros, Raptis, Spyros
Emotion detection in textual data has received growing interest in recent years, as it is pivotal for developing empathetic human-computer interaction systems. This paper introduces a method for categorizing emotions from text, which acknowledges and
External link:
http://arxiv.org/abs/2404.01805
Low-Resource Cross-Domain Singing Voice Synthesis via Reduced Self-Supervised Speech Representations
Author:
Kakoulidis, Panos, Ellinas, Nikolaos, Vamvoukakis, Georgios, Christidou, Myrsini, Vioni, Alexandra, Maniati, Georgia, Oh, Junkwang, Jho, Gunu, Hwang, Inchul, Tsiakoulis, Pirros, Chalamandaris, Aimilios
In this paper, we propose a singing voice synthesis model, Karaoker-SSL, that is trained only on text and speech data as a typical multi-speaker acoustic model. It is a low-resource pipeline that does not utilize any singing data end-to-end, since it
External link:
http://arxiv.org/abs/2402.01520
Author:
Nikitaras, Karolos, Klapsas, Konstantinos, Ellinas, Nikolaos, Maniati, Georgia, Sung, June Sig, Hwang, Inchul, Raptis, Spyros, Chalamandaris, Aimilios, Tsiakoulis, Pirros
This paper proposes an Expressive Speech Synthesis model that utilizes token-level latent prosodic variables in order to capture and control utterance-level attributes, such as character acting voice and speaking style. Current works aim to explicitl
External link:
http://arxiv.org/abs/2211.00523
Author:
Markopoulos, Konstantinos, Maniati, Georgia, Vamvoukakis, Georgios, Ellinas, Nikolaos, Vardaxoglou, Georgios, Kakoulidis, Panos, Oh, Junkwang, Jho, Gunu, Hwang, Inchul, Chalamandaris, Aimilios, Tsiakoulis, Pirros, Raptis, Spyros
The gender of any voice user interface is a key element of its perceived identity. Recently, there has been increasing interest in interfaces where the gender is ambiguous rather than clearly identifying as female or male. This work addresses the tas
External link:
http://arxiv.org/abs/2211.00375
Author:
Vioni, Alexandra, Maniati, Georgia, Ellinas, Nikolaos, Sung, June Sig, Hwang, Inchul, Chalamandaris, Aimilios, Tsiakoulis, Pirros
Current state-of-the-art methods for automatic synthetic speech evaluation are based on MOS prediction neural models. Such MOS prediction models include MOSNet and LDNet that use spectral features as input, and SSL-MOS that relies on a pretrained sel
External link:
http://arxiv.org/abs/2211.00342
Author:
Ellinas, Nikolaos, Vamvoukakis, Georgios, Markopoulos, Konstantinos, Maniati, Georgia, Kakoulidis, Panos, Sung, June Sig, Hwang, Inchul, Raptis, Spyros, Chalamandaris, Aimilios, Tsiakoulis, Pirros
This paper presents a method for end-to-end cross-lingual text-to-speech (TTS) which aims to preserve the target language's pronunciation regardless of the original speaker's language. The model used is based on a non-attentive Tacotron architecture,
External link:
http://arxiv.org/abs/2210.17264
Author:
Maniati, Georgia, Vioni, Alexandra, Ellinas, Nikolaos, Nikitaras, Karolos, Klapsas, Konstantinos, Sung, June Sig, Jho, Gunu, Chalamandaris, Aimilios, Tsiakoulis, Pirros
In this work, we present the SOMOS dataset, the first large-scale mean opinion scores (MOS) dataset consisting of solely neural text-to-speech (TTS) samples. It can be employed to train automatic MOS prediction systems focused on the assessment of mo
External link:
http://arxiv.org/abs/2204.03040
Author:
Markopoulos, Konstantinos, Ellinas, Nikolaos, Vioni, Alexandra, Christidou, Myrsini, Kakoulidis, Panos, Vamvoukakis, Georgios, Maniati, Georgia, Sung, June Sig, Park, Hyoungmin, Tsiakoulis, Pirros, Chalamandaris, Aimilios
In this paper, a text-to-rapping/singing system is introduced, which can be adapted to any speaker's voice. It utilizes a Tacotron-based multispeaker acoustic model trained on read-only speech data and which provides prosody control at the phoneme le
External link:
http://arxiv.org/abs/2111.09146
Author:
Maniati, Georgia, Ellinas, Nikolaos, Markopoulos, Konstantinos, Vamvoukakis, Georgios, Sung, June Sig, Park, Hyoungmin, Chalamandaris, Aimilios, Tsiakoulis, Pirros
The idea of using phonological features instead of phonemes as input to sequence-to-sequence TTS has been recently proposed for zero-shot multilingual speech synthesis. This approach is useful for code-switching, as it facilitates the seamless utteri
External link:
http://arxiv.org/abs/2111.09075
Author:
Ellinas, Nikolaos, Vamvoukakis, Georgios, Markopoulos, Konstantinos, Chalamandaris, Aimilios, Maniati, Georgia, Kakoulidis, Panos, Raptis, Spyros, Sung, June Sig, Park, Hyoungmin, Tsiakoulis, Pirros
This paper presents an end-to-end text-to-speech system with low latency on a CPU, suitable for real-time applications. The system is composed of an autoregressive attention-based sequence-to-sequence acoustic model and the LPCNet vocoder for wavefor
External link:
http://arxiv.org/abs/2111.09052