Showing 1 - 10 of 417 for the search: '"Agha, Ali"'
This paper presents an analysis of biases in open-source Large Language Models (LLMs) across various genders, religions, and races. We introduce a methodology for generating a bias detection dataset using seven bias triggers: General Debate, Position…
External link:
http://arxiv.org/abs/2410.12499
Active learning (AL) optimizes data labeling efficiency by selecting the most informative instances for annotation. A key component in this procedure is an acquisition function that guides the selection process and identifies the suitable instances f…
External link:
http://arxiv.org/abs/2410.04275
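The entry above hinges on the notion of an acquisition function. As a minimal illustration (not the paper's own method), a common baseline is least-confidence sampling: rank unlabeled instances by how uncertain the model's top prediction is, and label the most uncertain ones first. The function name here is illustrative.

```python
# Least-confidence acquisition (a standard AL baseline, shown for
# illustration): score each unlabeled instance by 1 - max class
# probability, then select the most uncertain instances first.
def least_confidence(probs):
    """probs: list of per-class probability lists, one per instance.
    Returns instance indices sorted by descending uncertainty."""
    scores = [1.0 - max(p) for p in probs]
    return sorted(range(len(probs)), key=lambda i: -scores[i])

# Example: the second instance (50/50 split) is the most uncertain,
# so it is the first candidate sent for annotation.
ranking = least_confidence([[0.9, 0.1], [0.5, 0.5], [0.7, 0.3]])
```

An annotation budget is then spent on the top of this ranking; richer acquisition functions additionally account for diversity among the selected instances.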
Author:
Arif, Samee, Arif, Taimoor, Haroon, Muhammad Saad, Khan, Aamina Jamal, Raza, Agha Ali, Athar, Awais
This paper introduces the concept of an education tool that utilizes Generative Artificial Intelligence (GenAI) to enhance storytelling for children. The system combines GenAI-driven narrative co-creation, text-to-speech conversion, and text-to-video…
External link:
http://arxiv.org/abs/2409.11261
This paper presents a comprehensive evaluation of Urdu Automatic Speech Recognition (ASR) models. We analyze the performance of three ASR model families: Whisper, MMS, and Seamless-M4T using Word Error Rate (WER), along with a detailed examination of…
External link:
http://arxiv.org/abs/2409.11252
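The metric used in the entry above, Word Error Rate, is the word-level edit distance between a reference transcript and the ASR hypothesis, divided by the reference length. A minimal sketch (standard Levenshtein dynamic programming, not tied to the paper's evaluation code):

```python
# Word Error Rate (WER): (substitutions + deletions + insertions)
# divided by the number of words in the reference transcript.
def wer(reference: str, hypothesis: str) -> float:
    ref = reference.split()
    hyp = hypothesis.split()
    # d[i][j] = edit distance between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(ref)][len(hyp)] / len(ref)
```

Note that WER can exceed 1.0 when the hypothesis contains many insertions relative to a short reference.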
This paper presents a novel methodology for generating synthetic Preference Optimization (PO) datasets using multi-agent workflows. We evaluate the effectiveness and potential of these workflows in automating and enhancing the dataset generation proc…
External link:
http://arxiv.org/abs/2408.08688
The Transformer architecture has revolutionized deep learning through its Self-Attention mechanism, which effectively captures contextual information. However, the memory footprint of Self-Attention presents significant challenges for long-sequence t…
External link:
http://arxiv.org/abs/2408.08454
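The memory challenge named in the entry above comes from the n x n attention score matrix, which grows quadratically with sequence length. A back-of-the-envelope sketch (assuming one float32 score per token pair, per head):

```python
# Memory for the full n x n attention score matrix, per head,
# assuming one score per token pair at dtype_bytes bytes each.
# This quadratic growth is the long-sequence bottleneck.
def attention_scores_bytes(seq_len: int, dtype_bytes: int = 4) -> int:
    return seq_len * seq_len * dtype_bytes

# At 1,024 tokens this is 4 MiB per head; at 32,768 tokens it is
# 4 GiB per head, which motivates memory-efficient attention variants.
per_head_1k = attention_scores_bytes(1024)
```

Multiplying by the number of heads and layers makes the quadratic term dominate activations at long context lengths.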
In this paper, we compare general-purpose models, GPT-4-Turbo and Llama-3-8b, with special-purpose models (XLM-Roberta-large, mT5-large, and Llama-3-8b) that have been fine-tuned on specific tasks. We focus on seven classification and seven generatio…
External link:
http://arxiv.org/abs/2407.04459
Published in:
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pp. 17237-17244, May 2024
This paper introduces UQA, a novel dataset for question answering and text comprehension in Urdu, a low-resource language with over 70 million native speakers. UQA is generated by translating the Stanford Question Answering Dataset (SQuAD2.0), a larg…
External link:
http://arxiv.org/abs/2405.01458
Active learning (AL) techniques reduce labeling costs for training neural machine translation (NMT) models by selecting smaller representative subsets from unlabeled data for annotation. Diversity sampling techniques select heterogeneous instances, w…
External link:
http://arxiv.org/abs/2403.09259
Author:
Frey, Jonas, Patel, Manthan, Atha, Deegan, Nubert, Julian, Fan, David, Agha, Ali, Padgett, Curtis, Spieler, Patrick, Hutter, Marco, Khattak, Shehryar
Autonomous navigation at high speeds in off-road environments necessitates robots to comprehensively understand their surroundings using onboard sensing only. The extreme conditions posed by the off-road setting can cause degraded camera image qualit…
External link:
http://arxiv.org/abs/2402.19341