Showing 1 - 10 of 2,101 for search: '"Waheed, Abdul A."'
Understanding how speech foundation models capture non-verbal cues is crucial for improving their interpretability and adaptability across diverse tasks. In our work, we analyze several prominent models such as Whisper, Seamless, Wav2Vec, HuBERT, and
External link:
http://arxiv.org/abs/2410.12948
Recent work on distilling Whisper's knowledge into small models using pseudo-labels shows promising performance while reducing the size by up to 50%. This results in small, efficient, and dedicated models. However, a critical step of distillation fr
External link:
http://arxiv.org/abs/2407.01257
Zero-shot multi-speaker text-to-speech (ZS-TTS) systems have advanced for English, however, it still lags behind due to insufficient resources. We address this gap for Arabic, a language of more than 450 million native speakers, by first adapting a s
External link:
http://arxiv.org/abs/2406.16751
Arabic is known to present unique challenges for Automatic Speech Recognition (ASR). On one hand, its rich linguistic diversity and wide range of dialects complicate the development of robust, inclusive models. On the other, current multilingual ASR
External link:
http://arxiv.org/abs/2406.04512
Author:
Waheed, Abdul, Talafha, Bashar, Sullivan, Peter, Elmadany, AbdelRahim, Abdul-Mageed, Muhammad
Arabic is a complex language with many varieties and dialects spoken by over 450 million people around the world. Due to the linguistic diversity and variations, it is challenging to build a robust and generalized ASR system for Arabic. In this work, w
External link:
http://arxiv.org/abs/2310.11069
Published in:
IEEE Access, 29 August 2023, Vol. 7, Electronic ISSN: 2169-3536, pp. 94945-94961, https://ieeexplore.ieee.org/document/10233857
The motion or out-of-focus effect in digital images is the main reason for the blurred regions in defocused-blurred images. It may adversely affect various image features such as texture, pixel, and region. Therefore, it is important to detect in-foc
External link:
http://arxiv.org/abs/2311.12845
Author:
Kadaoui, Karima, Magdy, Samar M., Waheed, Abdul, Khondaker, Md Tawkat Islam, El-Shangiti, Ahmed Oumar, Nagoudi, El Moatez Billah, Abdul-Mageed, Muhammad
Despite the purported multilingual proficiency of instruction-finetuned large language models (LLMs) such as ChatGPT and Bard, the linguistic inclusivity of these models remains insufficiently explored. Considering this constraint, we present a thoro
External link:
http://arxiv.org/abs/2308.03051
Whisper, the recently developed multilingual weakly supervised model, is reported to perform well on multiple speech recognition benchmarks in both monolingual and multilingual settings. However, it is not clear how Whisper would fare under diverse c
External link:
http://arxiv.org/abs/2306.02902
ChatGPT's emergence heralds a transformative phase in NLP, particularly demonstrated through its excellent performance on many English benchmarks. However, the model's efficacy across diverse linguistic contexts remains largely uncharted territory. T
External link:
http://arxiv.org/abs/2305.14976
Large language models (LLMs) with instruction fine-tuning demonstrate superior generative capabilities. However, these models are resource-intensive. To alleviate this issue, we explore distilling knowledge from instruction-tuned LLMs into much small
External link:
http://arxiv.org/abs/2304.14402