A survey of voice conversion based on non-parallel data

Author: Pengcheng LI, Xulong ZHANG, Jianzong WANG, Ning CHENG, Jing XIAO
Language: Chinese
Year of publication: 2024
Subject:
Source: 大数据, Vol 10, Pp 65-81 (2024)
Document type: article
ISSN: 2096-0271
DOI: 10.11959/j.issn.2096-0271.2024011
Description: Voice conversion is a research topic in the fields of speech and artificial intelligence. The goal of voice conversion is to change the timbre of speech while preserving the content of the source speech, making it sound as if spoken by the target speaker. It is essential to ensure both the quality and naturalness of the converted speech. Voice conversion based on non-parallel data is currently attracting much attention: models are trained on non-parallel multilingual speaker datasets, enabling many-to-many and any-to-any voice conversion. This paper provides a comprehensive summary and analysis of recent developments in non-parallel voice conversion. First, we outline the early voice conversion techniques based on parallel corpora and their limitations. Then, we introduce and compare various approaches to voice conversion based on non-parallel data and analyze them in depth. Finally, we summarize the field and offer an outlook on voice conversion technology.
Database: Directory of Open Access Journals