Showing 1 - 10 of 1,257 for search: '"SHAKEEL, MUHAMMAD"'
Contextualized end-to-end automatic speech recognition has been an active research area, with recent efforts focusing on the implicit learning of contextual phrases based on the final loss objective. However, these approaches ignore the useful contex
External link:
http://arxiv.org/abs/2406.16120
Authors:
Sudo, Yui, Shakeel, Muhammad, Fukumoto, Yosuke, Yan, Brian, Shi, Jiatong, Peng, Yifan, Watanabe, Shinji
End-to-end automatic speech recognition (E2E-ASR) can be classified into several network architectures, such as connectionist temporal classification (CTC), recurrent neural network transducer (RNN-T), attention-based encoder-decoder, and mask-predic
External link:
http://arxiv.org/abs/2406.02950
End-to-end (E2E) automatic speech recognition (ASR) can operate in two modes: streaming and non-streaming, each with its pros and cons. Streaming ASR processes the speech frames in real-time as it is being received, while non-streaming ASR waits for
External link:
http://arxiv.org/abs/2405.13514
Deep biasing (DB) enhances the performance of end-to-end automatic speech recognition (E2E-ASR) models for rare words or contextual phrases using a bias list. However, most existing methods treat bias phrases as sequences of subwords in a predefined
External link:
http://arxiv.org/abs/2405.13344
Authors:
Ganesh, Vaishnevy, Drever, Sara, Agilinko, Joshua, Vallamkondu, Vamsidhar, Majumdar, Samit, Shakeel, Muhammad
Published in:
GMS German Medical Science, Vol 19, p Doc10 (2021)
Background: Swallowed dentures can present with upper aerodigestive tract obstruction needing urgent intervention. Removing such an ingested denture can prove challenging and needs careful planning. Aim: To share our experience of managing patients wi
External link:
https://doaj.org/article/9b4ae53e47884ee3bdce71dfef2c7bb6
Authors:
Agilinko, Joshua, Drever, Sara Katharine, Wai Low, Winston Kin, Shakeel, Muhammad, Hussain, Akhtar
Published in:
GMS Interdisciplinary Plastic and Reconstructive Surgery DGPW, Vol 10, p Doc06 (2021)
Introduction: Pulsatile tinnitus (PT) can be very distressing for the patient. An identifiable abnormality is rarely detected. Dural AV malformation is responsible for arterial PT. Venous PT has rarely been attributed to an obvious abnormality on ven
External link:
https://doaj.org/article/31a35451c2344bc6a050486c8cab7607
There has been an increasing interest in large speech models that can perform multiple tasks in a single model. Such models usually adopt an encoder-decoder or decoder-only architecture due to their popularity and good performance in many domains. Ho
External link:
http://arxiv.org/abs/2402.12654
Authors:
Peng, Yifan, Tian, Jinchuan, Chen, William, Arora, Siddhant, Yan, Brian, Sudo, Yui, Shakeel, Muhammad, Choi, Kwanghee, Shi, Jiatong, Chang, Xuankai, Jung, Jee-weon, Watanabe, Shinji
Recent studies have highlighted the importance of fully open foundation models. The Open Whisper-style Speech Model (OWSM) is an initial step towards reproducing OpenAI Whisper using public data and open-source toolkits. However, previous versions of
External link:
http://arxiv.org/abs/2401.16658
End-to-end (E2E) automatic speech recognition (ASR) methods exhibit remarkable performance. However, since the performance of such methods is intrinsically linked to the context present in the training data, E2E-ASR methods do not perform as desired
External link:
http://arxiv.org/abs/2401.10449
Authors:
Peng, Yifan, Tian, Jinchuan, Yan, Brian, Berrebbi, Dan, Chang, Xuankai, Li, Xinjian, Shi, Jiatong, Arora, Siddhant, Chen, William, Sharma, Roshan, Zhang, Wangyou, Sudo, Yui, Shakeel, Muhammad, Jung, Jee-weon, Maiti, Soumi, Watanabe, Shinji
Pre-training speech models on large volumes of data has achieved remarkable success. OpenAI Whisper is a multilingual multitask model trained on 680k hours of supervised speech data. It generalizes well to various speech recognition and translation b
External link:
http://arxiv.org/abs/2309.13876