Showing 1 - 10 of 26 for search: '"Yusuf, Bolaji"'
Author:
Yusuf, Bolaji, Černocký, Jan "Honza", Saraçlar, Murat
End-to-end (E2E) keyword search (KWS) has emerged as an alternative and complementary approach to conventional keyword search which depends on the output of automatic speech recognition (ASR) systems. While E2E methods greatly simplify the KWS pipeli
External link:
http://arxiv.org/abs/2407.04652
This paper explores speculative speech recognition (SSR), where we empower conventional automatic speech recognition (ASR) with speculation capabilities, allowing the recognizer to run ahead of audio. We introduce a metric for measuring SSR performan
External link:
http://arxiv.org/abs/2407.04641
Author:
Yusuf, Bolaji, Saraçlar, Murat
Published in:
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 32, pp. 3213-3223, 2024
End-to-end (E2E) approaches to keyword search (KWS) are considerably simpler in terms of training and indexing complexity when compared to approaches which use the output of automatic speech recognition (ASR) systems. This simplification however has
External link:
http://arxiv.org/abs/2407.04601
Published in:
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 31, pp. 3070-3080, 2023
Conventional keyword search systems operate on automatic speech recognition (ASR) outputs, which causes them to have a complex indexing and search pipeline. This has led to interest in ASR-free approaches to simplify the search procedure. We recently
External link:
http://arxiv.org/abs/2308.08027
End-to-end speech recognition models are improved by incorporating external text sources, typically by fusion with an external language model. Such language models have to be retrained whenever the corpus of interest changes. Furthermore, since they
External link:
http://arxiv.org/abs/2303.10942
Improving end-to-end speech recognition by incorporating external text data has been a longstanding research topic. There has been a recent focus on training E2E ASR models that get the performance benefits of external text data without incurring the
External link:
http://arxiv.org/abs/2202.06045
Recently, neural approaches to spoken content retrieval have become popular. However, they tend to be restricted in their vocabulary or in their ability to deal with imbalanced test settings. These restrictions limit their applicability in keyword se
External link:
http://arxiv.org/abs/2108.10357
Documenting languages helps to prevent the extinction of endangered dialects, many of which are otherwise expected to disappear by the end of the century. When documenting oral languages, unsupervised word segmentation (UWS) from speech is a useful,
External link:
http://arxiv.org/abs/2106.04298
In this work, we propose a hierarchical subspace model for acoustic unit discovery. In this approach, we frame the task as one of learning embeddings on a low-dimensional phonetic subspace, and simultaneously specify the subspace itself as an embeddi
External link:
http://arxiv.org/abs/2011.03115
Author:
Yusuf, Bolaji, Ondel, Lucas
In this paper we describe our submission to the Zerospeech 2020 challenge, where the participants are required to discover latent representations from unannotated speech, and to use those representations to perform speech synthesis, with synthesis qu
External link:
http://arxiv.org/abs/2005.09282