Showing 1 - 10 of 62 for search: '"Alam, Md Mahfuz"'
Kurdish, an Indo-European language spoken by over 30 million speakers, is considered a dialect continuum and known for its diversity in language varieties. Previous studies addressing language and speech technology for Kurdish handle it in a monolith
External link:
http://arxiv.org/abs/2403.01983
It is relatively easy to mine a large parallel corpus for any machine learning task, such as speech-to-text or speech-to-speech translation. Although these mined corpora are large in volume, their quality is questionable. This work shows that the sim
External link:
http://arxiv.org/abs/2402.01945
The availability of parallel texts is crucial to the performance of machine translation models. However, most of the world's languages face the predominant challenge of data scarcity. In this paper, we propose strategies to synthesize parallel data r
External link:
http://arxiv.org/abs/2402.01939
Neural machine translation (NMT) systems exhibit limited robustness in handling source-side linguistic variations. Their performance tends to degrade when faced with even slight deviations in language usage, such as different domains or variations in
External link:
http://arxiv.org/abs/2305.17267
We present BIG-C (Bemba Image Grounded Conversations), a large multimodal dataset for Bemba. While Bemba is the most populous language of Zambia, it exhibits a dearth of resources which render the development of language technologies or language proc
External link:
http://arxiv.org/abs/2305.17202
Knowing the language of an input text/audio is a necessary first step for using almost every NLP tool such as taggers, parsers, or translation systems. Language identification is a well-studied problem, sometimes even considered solved; in reality, d
External link:
http://arxiv.org/abs/2305.14263
This report describes GMU's sentiment analysis system for the SemEval-2023 shared task AfriSenti-SemEval. We participated in all three sub-tasks: Monolingual, Multilingual, and Zero-Shot. Our approach uses models initialized with AfroXLMR-large, a pr
External link:
http://arxiv.org/abs/2304.12979
Question answering (QA) systems are now available through numerous commercial applications for a wide variety of domains, serving millions of users that interact with them via speech interfaces. However, current benchmarks in QA research do not accou
External link:
http://arxiv.org/abs/2109.12072
Author:
Alam, Md Mahfuz; Alam, Khondoker Mohammad; Momotaz, Rumana; Arifunnahar, Most; Rahman Bhuyin Apu, Md Mosiur; Siddique, Shaikh Sharmin
Published in:
Heliyon, 30 June 2024, 10(12)
Author:
Alam, Md Mahfuz ibn; Anastasopoulos, Antonios; Besacier, Laurent; Cross, James; Gallé, Matthias; Koehn, Philipp; Nikoulina, Vassilina
As neural machine translation (NMT) systems become an important part of professional translator pipelines, a growing body of work focuses on combining NMT with terminologies. In many scenarios and particularly in cases of domain adaptation, one expec
External link:
http://arxiv.org/abs/2106.11891