Search results

Handling OOV Words in Mandarin Spoken Term Detection with an Hierarchical n-Gram Language Model

Authors: Xingyu Na, Pengyuan Zhang, Xuyang Wang, Yonghong Yan, Jielin Pan

Published in: Chinese Journal of Electronics. 26:1239-1244

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::84bd389eb8ac6714a72ff62c77646434
https://doi.org/10.1049/cje.2017.07.004

AISHELL-1: An Open-Source Mandarin Speech Corpus and A Speech Recognition Baseline

Authors: Xingyu Na, Hui Bu, Jiayu Du, Bengu Wu, Hao Zheng

Published in: O-COCOSDA

An open-source Mandarin speech corpus called AISHELL-1 is released. It is by far the largest corpus which is suitable for conducting the speech recognition research and building speech recognition systems for Mandarin. The recording procedure, includ

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::e381080d2067cdf16bedba63ad26386e

Purely Sequence-Trained Neural Networks for ASR Based on Lattice-Free MMI

Authors: Daniel Galvez, Daniel Povey, Sanjeev Khudanpur, Vijayaditya Peddinti, Xingyu Na, Yiming Wang, Pegah Ghahremani, Vimal Manohar

Published in: INTERSPEECH

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::c60e6d0acaffa8f2fffa905280784ee9
https://doi.org/10.21437/interspeech.2016-595

An empirical exploration of CTC acoustic models

Authors: Yajie Miao, Tom Ko, Xingyu Na, Florian Metze, Alex Waibel, Mohammad Gowayyed

Published in: ICASSP

The connectionist temporal classification (CTC) loss function has several interesting properties relevant for automatic speech recognition (ASR): applied on top of deep recurrent neural networks (RNNs), CTC learns the alignments between speech frames

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::a8957febbf2b077adfe3ae9d37d31fdb
https://doi.org/10.1109/icassp.2016.7472152

Computational Auditory Scene Analysis Based Voice Activity Detection

Authors: Xiang Xie, Xingyu Na, Ming Tu

Published in: ICPR

Voice activity detection (VAD) is always important in many speech applications. In this paper, two VAD methods using novel features based on computational auditory scene analysis (CASA) are proposed. The first method is based on statistical model bas

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::2988a9bf0c435630bf5b207d4b7a3e15
https://doi.org/10.1109/icpr.2014.147

Low latency parameter generation for real-time speech synthesis system

Authors: Jingming Kuang, Xingyu Na, Xiang Xie

Published in: ICME

Speech synthesizer is commonly used in human-computer interaction. In many applicational cases, the computing resource is limited while real-time synthesis is demanded. The HMM-based speech synthesis technique allows creating a natural voice quality

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::c54c15a65b72baa08ebd9abb6044207c
https://doi.org/10.1109/icme.2014.6890197

Improving voice quality of HMM-based speech synthesis using voice conversion method

Authors: Xiang Xie, Yishan Jiao, Xingyu Na, Ming Tu

Published in: ICASSP

HMM-based speech synthesis system (HTS) often generates buzzy and muffled speech. Such degradation of voice quality makes synthetic speech sound robotically rather than naturally. From this point, we suppose that synthetic speech is in a different sp

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::6e6ac8193eb41421d8a555da88ba9695
https://doi.org/10.1109/icassp.2014.6855141

An improved tone labeling and prediction method with non-uniform segmentation of F0 contour

Authors: Yaling He, Xiang Xie, Xingyu Na, Jingming Kuang

Published in: ISCSLP

This paper proposes a tone labeling technique for tonal language speech synthesis. Non-uniform segmentation using Viterbi alignment is introduced to determine the boundaries to get F0 symbols, which are used as tonal label to eliminate the mismatch b

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::e9803f868fe4d868039cbd8da92c07a8
https://doi.org/10.1109/iscslp.2012.6423467

Incremental Syllable-Context Phonetic Vocoding

Authors: Petr Motlicek, Xingyu Na, Milos Cernak, Philip N. Garner, Alexandros Lazaridis

Current very low bit rate speech coders are, due to complexity limitations, designed to work off-line. This paper investigates incremental speech coding that operates real-time and incrementally (i.e., encoded speech depends only on already-uttered s

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::467100fc6f42a89e9aad96e7e609a965
https://infoscience.epfl.ch/record/206809
