Showing 1 - 10 of 417 for the search: '"Agha, Ali"'
This paper presents an analysis of biases in open-source Large Language Models (LLMs) across various genders, religions, and races. We introduce a methodology for generating a bias detection dataset using seven bias triggers: General Debate, Position…
External link:
http://arxiv.org/abs/2410.12499
Active learning (AL) optimizes data labeling efficiency by selecting the most informative instances for annotation. A key component in this procedure is an acquisition function that guides the selection process and identifies the suitable instances f…
External link:
http://arxiv.org/abs/2410.04275
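The entry above hinges on the notion of an acquisition function. As a minimal illustration (not the paper's own method), a common baseline is least-confidence sampling: rank unlabeled instances by how uncertain the model's top prediction is, and label the most uncertain ones first. The function name here is illustrative.

```python
# Least-confidence acquisition (a standard AL baseline, shown for
# illustration): score each unlabeled instance by 1 - max class
# probability, then select the most uncertain instances first.
def least_confidence(probs):
    """probs: list of per-class probability lists, one per instance.
    Returns instance indices sorted by descending uncertainty."""
    scores = [1.0 - max(p) for p in probs]
    return sorted(range(len(probs)), key=lambda i: -scores[i])

# Example: the second instance (50/50 split) is the most uncertain,
# so it is the first candidate sent for annotation.
ranking = least_confidence([[0.9, 0.1], [0.5, 0.5], [0.7, 0.3]])
```

An annotation budget is then spent on the top of this ranking; richer acquisition functions additionally account for diversity among the selected instances.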
Author:
Arif, Samee, Arif, Taimoor, Haroon, Muhammad Saad, Khan, Aamina Jamal, Raza, Agha Ali, Athar, Awais
This paper introduces the concept of an education tool that utilizes Generative Artificial Intelligence (GenAI) to enhance storytelling for children. The system combines GenAI-driven narrative co-creation, text-to-speech conversion, and text-to-video…
External link:
http://arxiv.org/abs/2409.11261
This paper presents a comprehensive evaluation of Urdu Automatic Speech Recognition (ASR) models. We analyze the performance of three ASR model families: Whisper, MMS, and Seamless-M4T using Word Error Rate (WER), along with a detailed examination of…
External link:
http://arxiv.org/abs/2409.11252
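The metric used in the entry above, Word Error Rate, is the word-level edit distance between a reference transcript and the ASR hypothesis, divided by the reference length. A minimal sketch (standard Levenshtein dynamic programming, not tied to the paper's evaluation code):

```python
# Word Error Rate (WER): (substitutions + deletions + insertions)
# divided by the number of words in the reference transcript.
def wer(reference: str, hypothesis: str) -> float:
    ref = reference.split()
    hyp = hypothesis.split()
    # d[i][j] = edit distance between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(ref)][len(hyp)] / len(ref)
```

Note that WER can exceed 1.0 when the hypothesis contains many insertions relative to a short reference.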
This paper presents a novel methodology for generating synthetic Preference Optimization (PO) datasets using multi-agent workflows. We evaluate the effectiveness and potential of these workflows in automating and enhancing the dataset generation proc…
External link:
http://arxiv.org/abs/2408.08688
The Transformer architecture has revolutionized deep learning through its Self-Attention mechanism, which effectively captures contextual information. However, the memory footprint of Self-Attention presents significant challenges for long-sequence t…
External link:
http://arxiv.org/abs/2408.08454
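The memory challenge named in the entry above comes from the n x n attention score matrix, which grows quadratically with sequence length. A back-of-the-envelope sketch (assuming one float32 score per token pair, per head):

```python
# Memory for the full n x n attention score matrix, per head,
# assuming one score per token pair at dtype_bytes bytes each.
# This quadratic growth is the long-sequence bottleneck.
def attention_scores_bytes(seq_len: int, dtype_bytes: int = 4) -> int:
    return seq_len * seq_len * dtype_bytes

# At 1,024 tokens this is 4 MiB per head; at 32,768 tokens it is
# 4 GiB per head, which motivates memory-efficient attention variants.
per_head_1k = attention_scores_bytes(1024)
```

Multiplying by the number of heads and layers makes the quadratic term dominate activations at long context lengths.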
In this paper, we compare general-purpose models, GPT-4-Turbo and Llama-3-8b, with special-purpose models (XLM-Roberta-large, mT5-large, and Llama-3-8b) that have been fine-tuned on specific tasks. We focus on seven classification and seven generatio…
External link:
http://arxiv.org/abs/2407.04459
Published in:
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pp. 17237-17244, May 2024
This paper introduces UQA, a novel dataset for question answering and text comprehension in Urdu, a low-resource language with over 70 million native speakers. UQA is generated by translating the Stanford Question Answering Dataset (SQuAD2.0), a larg…
External link:
http://arxiv.org/abs/2405.01458
Active learning (AL) techniques reduce labeling costs for training neural machine translation (NMT) models by selecting smaller representative subsets from unlabeled data for annotation. Diversity sampling techniques select heterogeneous instances, w…
External link:
http://arxiv.org/abs/2403.09259
Author:
Frey, Jonas, Patel, Manthan, Atha, Deegan, Nubert, Julian, Fan, David, Agha, Ali, Padgett, Curtis, Spieler, Patrick, Hutter, Marco, Khattak, Shehryar
Autonomous navigation at high speeds in off-road environments necessitates robots to comprehensively understand their surroundings using onboard sensing only. The extreme conditions posed by the off-road setting can cause degraded camera image qualit…
External link:
http://arxiv.org/abs/2402.19341