Showing 1 - 10 of 489 for search: '"Wu, Yuning"'
In speech generation tasks, human subjective ratings, usually referred to as the opinion score, are considered the "gold standard" for speech quality evaluation, with the mean opinion score (MOS) serving as the primary evaluation metric. Due to the h
External link:
http://arxiv.org/abs/2406.10911
Discrete representation has shown advantages in speech generation tasks, wherein discrete tokens are derived by discretizing hidden features from self-supervised learning (SSL) pre-trained models. However, the direct application of speech SSL models
External link:
http://arxiv.org/abs/2406.08905
Singing Voice Synthesis (SVS) has witnessed significant advancements with the advent of deep learning techniques. However, a significant challenge in SVS is the scarcity of labeled singing voice data, which limits the effectiveness of supervised lear
External link:
http://arxiv.org/abs/2406.08761
Recent advancements in speech synthesis witness significant benefits by leveraging discrete tokens extracted from self-supervised learning (SSL) models. Discrete tokens offer higher storage efficiency and greater operability in intermediate represent
External link:
http://arxiv.org/abs/2406.08416
Author:
Chang, Xuankai, Shi, Jiatong, Tian, Jinchuan, Wu, Yuning, Tang, Yuxun, Wu, Yihan, Watanabe, Shinji, Adi, Yossi, Chen, Xie, Jin, Qin
Representing speech and audio signals in discrete units has become a compelling alternative to traditional high-dimensional feature vectors. Numerous studies have highlighted the efficacy of discrete units in various applications such as speech compr
External link:
http://arxiv.org/abs/2406.07725
Author:
Akin, Ömer, Wu, Yuning
This paper explores the paradoxical nature of computational creativity, focusing on the inherent limitations of closed digital systems in emulating the open-ended, dynamic process of human creativity. Through a comprehensive analysis, we delve into t
External link:
http://arxiv.org/abs/2404.15303
In the dynamic construction industry, traditional robotic integration has primarily focused on automating specific tasks, often overlooking the complexity and variability of human aspects in construction workflows. This paper introduces a human-cente
External link:
http://arxiv.org/abs/2403.19060
Author:
Shi, Jiatong, Lin, Yueqian, Bai, Xinyi, Zhang, Keyi, Wu, Yuning, Tang, Yuxun, Yu, Yifeng, Jin, Qin, Watanabe, Shinji
In singing voice synthesis (SVS), generating singing voices from musical scores faces challenges due to limited data availability. This study proposes a unique strategy to address the data scarcity in SVS. We employ an existing singing voice synthesi
External link:
http://arxiv.org/abs/2401.17619
There has been a growing interest in using end-to-end acoustic models for singing voice synthesis (SVS). Typically, these models require an additional vocoder to transform the generated acoustic features into the final waveform. However, since the ac
External link:
http://arxiv.org/abs/2308.02867
Published in:
Policing: An International Journal, 2024, Vol. 47, Issue 4, pp. 663-681.