Search results - "Cooper, Erica"

Report

The VoiceMOS Challenge 2024: Beyond Speech Quality Prediction

Authors: Huang, Wen-Chin, Fu, Szu-Wei, Cooper, Erica, Zezario, Ryandhimas E., Toda, Tomoki, Wang, Hsin-Min, Yamagishi, Junichi, Tsao, Yu

We present the third edition of the VoiceMOS Challenge, a scientific initiative designed to advance research into automatic prediction of human speech ratings. There were three tracks. The first track was on predicting the quality of "zoomed-in" hi

External link: http://arxiv.org/abs/2409.07001

Report

Spoofing-Aware Speaker Verification Robust Against Domain and Channel Mismatches

Authors: Zeng, Chang, Miao, Xiaoxiao, Wang, Xin, Cooper, Erica, Yamagishi, Junichi

In real-world applications, it is challenging to build a speaker verification system that is simultaneously robust against common threats, including spoofing attacks, channel mismatch, and domain mismatch. Traditional automatic speaker verification (

External link: http://arxiv.org/abs/2409.06327

Report

An Initial Investigation of Language Adaptation for TTS Systems under Low-resource Scenarios

Authors: Gong, Cheng, Cooper, Erica, Wang, Xin, Qiang, Chunyu, Geng, Mengzhe, Wells, Dan, Wang, Longbiao, Dang, Jianwu, Tessier, Marc, Pine, Aidan, Richmond, Korin, Yamagishi, Junichi

Self-supervised learning (SSL) representations from massively multilingual models offer a promising solution for low-resource language speech tasks. Despite advancements, language adaptation in TTS systems remains an open problem. This paper explores

External link: http://arxiv.org/abs/2406.08911

Report

Generating Speakers by Prompting Listener Impressions for Pre-trained Multi-Speaker Text-to-Speech Systems

Authors: Chen, Zhengyang, Liu, Xuechen, Cooper, Erica, Yamagishi, Junichi, Qian, Yanmin

This paper proposes a speech synthesis system that allows users to specify and control the acoustic characteristics of a speaker by means of prompts describing the speaker's traits of synthesized speech. Unlike previous approaches, our method utilize

External link: http://arxiv.org/abs/2406.08812

Report

Spoof Diarization: 'What Spoofed When' in Partially Spoofed Audio

Authors: Zhang, Lin, Wang, Xin, Cooper, Erica, Diez, Mireia, Landini, Federico, Evans, Nicholas, Yamagishi, Junichi

This paper defines Spoof Diarization as a novel task in the Partial Spoof (PS) scenario. It aims to determine what spoofed when, which includes not only locating spoof regions but also clustering them according to different spoofing methods. As a pio

External link: http://arxiv.org/abs/2406.07816

Report

Uncertainty as a Predictor: Leveraging Self-Supervised Learning for Zero-Shot MOS Prediction

Authors: Ravuri, Aditya, Cooper, Erica, Yamagishi, Junichi

Predicting audio quality in voice synthesis and conversion systems is a critical yet challenging task, especially when traditional methods like Mean Opinion Scores (MOS) are cumbersome to collect at scale. This paper addresses the gap in efficient au

External link: http://arxiv.org/abs/2312.15616

Report

ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations

Authors: Gong, Cheng, Wang, Xin, Cooper, Erica, Wells, Dan, Wang, Longbiao, Dang, Jianwu, Richmond, Korin, Yamagishi, Junichi

Neural text-to-speech (TTS) has achieved human-like synthetic speech for single-speaker, single-language synthesis. Multilingual TTS systems are limited to resource-rich languages due to the lack of large paired text and studio-quality audio data. TT

External link: http://arxiv.org/abs/2312.14398

Report

Speaker-Text Retrieval via Contrastive Learning

Authors: Liu, Xuechen, Wang, Xin, Cooper, Erica, Miao, Xiaoxiao, Yamagishi, Junichi

In this study, we introduce a novel cross-modal retrieval task involving speaker descriptions and their corresponding audio samples. Utilizing pre-trained speaker and text encoders, we present a simple learning framework based on contrastive learning

External link: http://arxiv.org/abs/2312.06055

Report

Authors: Yadav, Hemant, Cooper, Erica, Yamagishi, Junichi, Sitaram, Sunayana, Shah, Rajiv Ratn

This paper introduces a novel objective function for quality mean opinion score (MOS) prediction of unseen speech synthesis systems. The proposed function measures the similarity of relative positions of predicted MOS values, in a mini-batch, rather

External link: http://arxiv.org/abs/2310.05078

Report

The VoiceMOS Challenge 2023: Zero-shot Subjective Speech Quality Prediction for Multiple Domains

Authors: Cooper, Erica, Huang, Wen-Chin, Tsao, Yu, Wang, Hsin-Min, Toda, Tomoki, Yamagishi, Junichi

We present the second edition of the VoiceMOS Challenge, a scientific event that aims to promote the study of automatic prediction of the mean opinion score (MOS) of synthesized and processed speech. This year, we emphasize real-world and challenging

External link: http://arxiv.org/abs/2310.02640
