Search results

Report

Connecting Ideas in 'Lower-Resource' Scenarios: NLP for National Varieties, Creoles and Other Low-resource Scenarios

Authors: Joshi, Aditya; Kanojia, Diptesh; Lent, Heather; Kaing, Hour; Song, Haiyue

Despite excellent results on benchmarks over a small subset of languages, large language models struggle to process text from languages situated in 'lower-resource' scenarios such as dialects/sociolects (national or social varieties of a language), C

External link: http://arxiv.org/abs/2409.12683

Report

AV-GS: Learning Material and Geometry Aware Priors for Novel View Acoustic Synthesis

Authors: Bhosale, Swapnil; Yang, Haosen; Kanojia, Diptesh; Deng, Jiankang; Zhu, Xiatian

Novel view acoustic synthesis (NVAS) aims to render binaural audio at any target viewpoint, given a mono audio emitted by a sound source at a 3D scene. Existing methods have proposed NeRF-based implicit models to exploit visual cues as a condition fo

External link: http://arxiv.org/abs/2406.08920

Report

Unsupervised Audio-Visual Segmentation with Modality Alignment

Authors: Bhosale, Swapnil; Yang, Haosen; Kanojia, Diptesh; Deng, Jiankang; Zhu, Xiatian

Audio-Visual Segmentation (AVS) aims to identify, at the pixel level, the object in a visual scene that produces a given sound. Current AVS methods rely on costly fine-grained annotations of mask-audio pairs, making them impractical for scalability.

External link: http://arxiv.org/abs/2403.14203

Report

Google Translate Error Analysis for Mental Healthcare Information: Evaluating Accuracy, Comprehensibility, and Implications for Multilingual Healthcare Communication

Authors: Delfani, Jaleh; Orasan, Constantin; Saadany, Hadeel; Temizoz, Ozlem; Taylor-Stilgoe, Eleanor; Kanojia, Diptesh; Braun, Sabine; Schouten, Barbara

This study explores the use of Google Translate (GT) for translating mental healthcare (MHealth) information and evaluates its accuracy, comprehensibility, and implications for multilingual healthcare communication through analysing GT output in the

External link: http://arxiv.org/abs/2402.04023

Report

Airavata: Introducing Hindi Instruction-tuned LLM

Authors: Gala, Jay; Jayakumar, Thanmay; Husain, Jaavid Aktar; M, Aswanth Kumar; Khan, Mohammed Safi Ur Rahman; Kanojia, Diptesh; Puduppully, Ratish; Khapra, Mitesh M.; Dabre, Raj; Murthy, Rudra; Kunchukuttan, Anoop

We announce the initial release of "Airavata," an instruction-tuned LLM for Hindi. Airavata was created by fine-tuning OpenHathi with diverse, instruction-tuning Hindi datasets to make it better suited for assistive tasks. Along with the model, we al

External link: http://arxiv.org/abs/2401.15006

Report

Natural Language Processing for Dialects of a Language: A Survey

Authors: Joshi, Aditya; Dabre, Raj; Kanojia, Diptesh; Li, Zhuang; Zhan, Haolan; Haffari, Gholamreza; Dippold, Doris

State-of-the-art natural language processing (NLP) models are trained on massive training corpora, and report a superlative performance on evaluation datasets. This survey delves into an important attribute of these datasets: the dialect of a languag

External link: http://arxiv.org/abs/2401.05632

Report

APE-then-QE: Correcting then Filtering Pseudo Parallel Corpora for MT Training Data Creation

Authors: Batheja, Akshay; Deoghare, Sourabh; Kanojia, Diptesh; Bhattacharyya, Pushpak

Automatic Post-Editing (APE) is the task of automatically identifying and correcting errors in the Machine Translation (MT) outputs. We propose a repair-filter-use methodology that uses an APE system to correct errors on the target side of the MT tra

External link: http://arxiv.org/abs/2312.11312

Report

SurreyAI 2023 Submission for the Quality Estimation Shared Task

Authors: Sindhujan, Archchana; Kanojia, Diptesh; Orasan, Constantin; Ranasinghe, Tharindu

Quality Estimation (QE) systems are important in situations where it is necessary to assess the quality of translations, but there is no reference available. This paper describes the approach adopted by the SurreyAI team for addressing the Sentence-L

External link: http://arxiv.org/abs/2312.00525

Report

CreoleVal: Multilingual Multitask Benchmarks for Creoles

Creoles represent an under-explored and marginalized group of languages, with few available resources for NLP research. While the genealogical ties between Creoles and a number of highly-resourced languages imply a significant potential for transfer l

External link: http://arxiv.org/abs/2310.19567

Report

Sarcasm in Sight and Sound: Benchmarking and Expansion to Improve Multimodal Sarcasm Detection

Authors: Bhosale, Swapnil; Chaudhuri, Abhra; Williams, Alex Lee Robert; Tiwari, Divyank; Dutta, Anjan; Zhu, Xiatian; Bhattacharyya, Pushpak; Kanojia, Diptesh

The introduction of the MUStARD dataset, and its emotion recognition extension MUStARD++, have identified sarcasm to be a multi-modal phenomenon -- expressed not only in natural language text, but also through manners of speech (like tonality and int

External link: http://arxiv.org/abs/2310.01430
