Search results - "Hong-Kwang J. Kuo"

Integrating Text Inputs For Training and Adapting RNN Transducer ASR Models

Authors: Samuel Thomas, Brian Kingsbury, George Saon, Hong-Kwang J. Kuo

Compared to hybrid automatic speech recognition (ASR) systems that use a modular architecture in which each component can be independently adapted to a new domain, recent end-to-end (E2E) ASR systems are harder to customize due to their all-neural mon

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::6644f6c216ee03e6d48cae1293cc200e
http://arxiv.org/abs/2202.13155

Towards Reducing the Need for Speech Training Data To Build Spoken Language Understanding Systems

Authors: Samuel Thomas, Hong-Kwang J. Kuo, Brian Kingsbury, George Saon

The lack of speech data annotated with labels required for spoken language understanding (SLU) is often a major hurdle in building end-to-end (E2E) systems that can directly process speech inputs. In contrast, large amounts of text data with suitable

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::e4f94b4c88137733a0ebfbf85f9af626

Integrating Dialog History into End-to-End Spoken Language Understanding Systems

Authors: Zoltán Tüske, Sachindra Joshi, Samuel Thomas, Brian Kingsbury, Hong-Kwang J. Kuo, Jatin Ganhotra, George Saon

End-to-end spoken language understanding (SLU) systems that process human-human or human-computer interactions are often context independent and process each turn of a conversation independently. Spoken conversations, on the other hand, are very much

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::e3ea2a5d9d0e9befb147ae4ad8d44aa3
http://arxiv.org/abs/2108.08405

RNN Transducer Models for Spoken Language Understanding

Authors: Brian Kingsbury, Hong-Kwang J. Kuo, Zvi Kons, Ron Hoory, Gakuto Kurata, Samuel Thomas, Zoltán Tüske, George Saon

Published in: ICASSP

We present a comprehensive study on building and adapting RNN transducer (RNN-T) models for spoken language understanding (SLU). These end-to-end (E2E) models are constructed in three practical settings: a case where verbatim transcripts are available

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::f36e4ecc2425c283e66923ba50cedeeb
https://doi.org/10.1109/icassp39728.2021.9414029

End-to-end spoken language understanding using transformer networks and self-supervised pre-trained features

Authors: Zoltán Tüske, Brian Kingsbury, Hong-Kwang J. Kuo, Samuel Thomas, Edmilson da Silva Morais

Published in: ICASSP

Transformer networks and self-supervised pre-training have consistently delivered state-of-the-art results in the field of natural language processing (NLP); however, their merits in the field of spoken language understanding (SLU) still need further inv

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::a0e2f52c5ef493d0a61989a55a633f22
http://arxiv.org/abs/2011.08238

End-to-End Spoken Language Understanding Without Full Transcripts

Authors: Kartik Audhkhasi, Luis A. Lastras, Zoltán Tüske, Yinghui Huang, Zvi Kons, Samuel Thomas, Brian Kingsbury, Hong-Kwang J. Kuo, Gakuto Kurata, Ron Hoory

Published in: INTERSPEECH

An essential component of spoken language understanding (SLU) is slot filling: representing the meaning of a spoken utterance using semantic entity labels. In this paper, we develop end-to-end (E2E) spoken language understanding systems that directly

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::ae16fc2d5690be35db90fd0445ff9fdb
https://doi.org/10.21437/interspeech.2020-2924

The IBM 2016 English Conversational Telephone Speech Recognition System

Authors: George Saon, Hong-Kwang J. Kuo, Steven J. Rennie, Tom Sercu

Published in: INTERSPEECH

We describe a collection of acoustic and language modeling techniques that lowered the word error rate of our English conversational telephone LVCSR system to a record 6.6% on the Switchboard subset of the Hub5 2000 evaluation test set. On the acousti

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::9b47cf76418cefe869048f1b286bd5a5
https://doi.org/10.21437/interspeech.2016-1460

The IBM 2015 English Conversational Telephone Speech Recognition System

Authors: George Saon, Hong-Kwang J. Kuo, Michael Picheny, Steven J. Rennie

Published in: INTERSPEECH

We describe the latest improvements to the IBM English conversational telephone speech recognition system. Some of the techniques that were found beneficial are: maxout networks with annealed dropout rates; networks with a very large number of output

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::471121fad3d36e074b9651b4cb90f76d
