Search results - "Minematsu, Nobuaki"

Report

A Pilot Study of Applying Sequence-to-Sequence Voice Conversion to Evaluate the Intelligibility of L2 Speech Using a Native Speaker's Shadowings

Authors: Geng, Haopeng; Saito, Daisuke; Minematsu, Nobuaki

Utterances by L2 speakers can be unintelligible due to mispronunciation and improper prosody. In computer-aided language learning systems, textual feedback is often provided using a speech recognition engine. However, an ideal form of feedback for L2

External link: http://arxiv.org/abs/2410.02239

Report

Simulating Native Speaker Shadowing for Nonnative Speech Assessment with Latent Speech Representations

Authors: Geng, Haopeng; Saito, Daisuke; Minematsu, Nobuaki

Evaluating speech intelligibility is a critical task in computer-aided language learning systems. Traditional methods often rely on word error rates (WER) provided by automatic speech recognition (ASR) as intelligibility scores. However, this approac

External link: http://arxiv.org/abs/2409.11742

Report

A Pilot Study of GSLM-based Simulation of Foreign Accentuation Only Using Native Speech Corpora

Authors: Onda, Kentaro; Park, Joonyong; Minematsu, Nobuaki; Saito, Daisuke

We propose a method of simulating the human process of foreign accentuation using Generative Spoken Language Model (GSLM) only with native speech corpora. When one listens to spoken words of a foreign language and repeats them, the repeated speech is

External link: http://arxiv.org/abs/2407.11370

Report

Exploring Isolated Musical Notes as Pre-training Data for Predominant Instrument Recognition in Polyphonic Music

Authors: Zhong, Lifan; Cooper, Erica; Yamagishi, Junichi; Minematsu, Nobuaki

With the growing amount of musical data available, automatic instrument recognition, one of the essential problems in Music Information Retrieval (MIR), is drawing more and more attention. While automatic recognition of single instruments has been we

External link: http://arxiv.org/abs/2306.08850

Report

Hierarchical Softmax for End-to-End Low-resource Multilingual Speech Recognition

Authors: Liu, Qianying; Gong, Zhuo; Yang, Zhengdong; Yang, Yuhang; Li, Sheng; Ding, Chenchen; Minematsu, Nobuaki; Huang, Hao; Cheng, Fei; Chu, Chenhui; Kurohashi, Sadao

Low-resource speech recognition has long suffered from insufficient training data. In this paper, we propose an approach that leverages neighboring languages to improve low-resource scenario performance, founded on the hypothesis that similar l

External link: http://arxiv.org/abs/2204.03855

Report

Wasserstein GAN and Waveform Loss-based Acoustic Model Training for Multi-speaker Text-to-Speech Synthesis Systems Using a WaveNet Vocoder

Authors: Zhao, Yi; Takaki, Shinji; Luong, Hieu-Thi; Yamagishi, Junichi; Saito, Daisuke; Minematsu, Nobuaki

Recent neural networks such as WaveNet and SampleRNN that learn directly from speech waveform samples have achieved very high-quality synthetic speech in terms of both naturalness and speaker similarity even in multi-speaker text-to-speech synthesis

External link: http://arxiv.org/abs/1807.11679
