Showing 1 - 10 of 843 results for search: '"Low-resource languages"'
Author:
Arailym Tleubayeva, Aday Shomanov
Published in:
Scientific Journal of Astana IT University, Pp 89-97 (2024)
This paper presents a comparative analysis of large pretrained multilingual models for question-answering (QA) systems, with a specific focus on their adaptation to the Kazakh language. The study evaluates models including mBERT, XLM-R, mT5, AYA, and
External link:
https://doaj.org/article/3543bae418d44932a958e712361e99ab
Published in:
Data in Brief, Vol 57, Pp 110967 (2024)
Sentiment analysis is an essential task that involves the extraction, identification, characterization, and classification of textual data to understand and categorize the attitudes and opinions expressed by individuals. While other languages have ex
External link:
https://doaj.org/article/341cd0060b8d4573b9e217da43d815e6
Published in:
Systems and Soft Computing, Vol 6, Pp 200112 (2024)
In today's digital era, social media has become a new tool for communication and sharing information, with the availability of high-speed internet it tends to reach the masses much faster. Lack of regulations and ethics have made advancement in the p
External link:
https://doaj.org/article/ce88c723ec844c3a9b77066602116665
Published in:
Frontiers in Artificial Intelligence, Vol 7 (2024)
The data-hungry statistical machine translation (SMT) and neural machine translation (NMT) models offer state-of-the-art results for languages with abundant data resources. However, extensive research is imperative to make these models perform equall
External link:
https://doaj.org/article/e3acad548ae043aa92cc8cad027e5f69
Author:
Aida Halitaj, Arkaitz Zubiaga
Published in:
Natural Language Processing Journal, Vol 8, Pp 100093 (2024)
Authoritative citations are critical to ensure information integrity, especially in encyclopedias like Wikipedia. To date, research on automating citation worthiness detection has largely focused on the most resourceful language, English Wikipedia, n
External link:
https://doaj.org/article/40d47856befc4dfa8ec8b8215d386406
Author:
Noor Mairukh Khan Arnob, A. Faiyaz, Md Mubtasim Fuad, Shah Murtaza Rashid Al Masud, Baivab Das, M.F. Mridha
Published in:
Data in Brief, Vol 55, Pp 110690 (2024)
The languages of the Indian subcontinent are less represented in current NLP literature. To mitigate this gap, we present the IndicDialogue dataset, which contains subtitles and dialogues in 10 major Indic languages: Hindi, Bengali, Marathi, Telugu,
External link:
https://doaj.org/article/da600f40bc6345d8bb722b6eed7ef354
Author:
Elisa S. Izrailova, Arslanbek V. Astemirov, Ayshat S. Badaeva, Zelimhan A. Sultanov, Salaudin M. Umarkhadzhiev, Mokhmad-Salekh L. Khekhaev, Madina L. Yasaeva
Published in:
Naučno-tehničeskij Vestnik Informacionnyh Tehnologij, Mehaniki i Optiki, Vol 24, Iss 1, Pp 41-50 (2024)
The problem of resolving the uncertainties associated with homonymy for the Chechen language has become especially relevant after the creation of speech synthesis systems. The main disadvantage of speech synthesizers in the Chechen language are err
External link:
https://doaj.org/article/c4c8665138224fb8a587b5777d96f6ab
Published in:
IEEE Access, Vol 12, Pp 158493-158504 (2024)
This paper proposes a novel meta-transfer learning method to improve automatic speech recognition (ASR) performance in low-resource languages. Nowadays, we are witnessing high interest in low-resource ASR tasks aiming at delivering feasible and relia
External link:
https://doaj.org/article/1da48bcf3661442ba97f17b20110c40f
Author:
Harisu Abdullahi Shehu, Kaloma Usman Majikumna, Aminu Bashir Suleiman, Stephen Luka, Md. Haidar Sharif, Rabie A. Ramadan, Huseyin Kusetogullari
Published in:
IEEE Access, Vol 12, Pp 98900-98916 (2024)
Opinion mining has witnessed significant advancements in well-resourced languages. However, for low-resource languages, this landscape remains relatively unexplored. This paper addresses this gap by conducting a comprehensive investigation into senti
External link:
https://doaj.org/article/d8a16f3adbfb47d4a7f1424b8e6afacf
Published in:
IEEE Access, Vol 12, Pp 87323-87332 (2024)
Sarcasm detection in the Indonesian language poses a unique set of challenges due to the linguistic nuances and cultural specificities of the Indonesian social media landscape. Understanding the dynamics of sarcasm in this context requires a deep div
External link:
https://doaj.org/article/046bd1a172ef404fbbdcb2d71f51e1ad