Parametric Representation for Singing Voice Synthesis: a Comparative Evaluation

Authors:	Thomas Drugman, Tuomo Raitio, Daniel Erro, Onur Babacan, Thierry Dutoit
Year of publication:	2020
Subject:	FOS: Computer and information sciences; Sound (cs.SD); Computer Science - Computation and Language; Computer science; Stochastic modelling; Speech recognition; SIGNAL (programming language); 020206 networking & telecommunications; Context (language use); 02 engineering and technology; Speech processing; Computer Science - Sound; Singing voice synthesis; 030507 speech-language pathology & audiology; 03 medical and health sciences; Noise; Audio and Speech Processing (eess.AS); 0202 electrical engineering, electronic engineering, information engineering; Harmonic; FOS: Electrical engineering, electronic engineering, information engineering; 0305 other medical science; Representation (mathematics); Computation and Language (cs.CL); Parametric statistics; Electrical Engineering and Systems Science - Audio and Speech Processing
Source:	ICASSP
DOI:	10.48550/arxiv.2006.04142
Description:	Various parametric representations have been proposed to model the speech signal. While the performance of such vocoders is well known in the context of speech processing, their extrapolation to singing voice synthesis might not be straightforward. The goal of this paper is twofold. First, a comparative subjective evaluation is performed across four existing techniques suitable for statistical parametric synthesis: the traditional pulse vocoder, the Deterministic plus Stochastic Model, the Harmonic plus Noise Model, and GlottHMM. The behavior of these techniques as a function of the singer type (baritone, counter-tenor, and soprano) is studied. Second, the artifacts occurring in high-pitched voices are discussed, and possible approaches to overcome them are suggested.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::7f286ec04be564504f6efcdf48b2946f