Search results

Report

FASSILA: A Corpus for Algerian Dialect Fake News Detection and Sentiment Analysis

Authors: Abdedaiem, Amin; Dahou, Abdelhalim Hafedh; Cheragui, Mohamed Amine; Mathiak, Brigitte

Published in: Procedia Computer Science, 244, 397-407 (2024)

In the context of low-resource languages, the Algerian dialect (AD) faces challenges due to the absence of annotated corpora, hindering its effective processing, notably in Machine Learning (ML) applications reliant on corpora for training and assess

External link: http://arxiv.org/abs/2411.04604

Report

From Unimodal to Multimodal: Scaling up Projectors to Align Modalities

Authors: Maniparambil, Mayug; Akshulakov, Raiymbek; Djilali, Yasser Abdelaziz Dahou; Narayan, Sanath; Singh, Ankit; O'Connor, Noel E.

Recent contrastive multimodal vision-language models like CLIP have demonstrated robust open-world semantic understanding, becoming the standard image backbones for vision-language applications due to their aligned latent space. However, this practic

External link: http://arxiv.org/abs/2409.19425

Report

Falcon2-11B Technical Report

Authors: Malartic, Quentin; Chowdhury, Nilabhra Roy; Cojocaru, Ruxandra; Farooq, Mugariya; Campesan, Giulia; Djilali, Yasser Abdelaziz Dahou; Narayan, Sanath; Singh, Ankit; Velikanov, Maksim; Boussaha, Basma El Amel; Al-Yafeai, Mohammed; Alobeidli, Hamza; Qadi, Leen Al; Seddik, Mohamed El Amine; Fedyanin, Kirill; Alami, Reda; Hacid, Hakim

We introduce Falcon2-11B, a foundation model trained on over five trillion tokens, and its multimodal counterpart, Falcon2-11B-vlm, which is a vision-to-text model. We report our findings during the training of the Falcon2-11B which follows a multi-s

External link: http://arxiv.org/abs/2407.14885

Report

ViSpeR: Multilingual Audio-Visual Speech Recognition

Authors: Narayan, Sanath; Djilali, Yasser Abdelaziz Dahou; Singh, Ankit; Bihan, Eustache Le; Hacid, Hakim

This work presents an extensive and detailed study on Audio-Visual Speech Recognition (AVSR) for five widely spoken languages: Chinese, Spanish, English, Arabic, and French. We have collected large-scale datasets for each language except for English,

External link: http://arxiv.org/abs/2406.00038

Report

On the Anatomy of Real-World R Code for Static Analysis

Authors: Sihler, Florian; Pietzschmann, Lukas; Straub, Raphael; Tichy, Matthias; Diera, Andor; Dahou, Abdelhalim

CONTEXT: The R programming language has a huge and active community, especially in the area of statistical computing. Its interpreted nature allows for several interesting constructs, like the manipulation of functions at run-time, that hinder the sta

External link: http://arxiv.org/abs/2401.16228

Report

Do Vision and Language Encoders Represent the World Similarly?

Authors: Maniparambil, Mayug; Akshulakov, Raiymbek; Djilali, Yasser Abdelaziz Dahou; Narayan, Sanath; Seddik, Mohamed El Amine; Mangalam, Karttikeya; O'Connor, Noel E.

Aligned text-image encoders such as CLIP have become the de facto model for vision-language tasks. Furthermore, modality-specific encoders achieve impressive performances in their respective domains. This raises a central question: does an alignment

External link: http://arxiv.org/abs/2401.05224

Report

Pneumonia Detection on chest X-ray images Using Ensemble of Deep Convolutional Neural Networks

Authors: Mabrouk, Alhassan; Redondo, Rebeca P. Díaz; Dahou, Abdelghani; Elaziz, Mohamed Abd; Kayed, Mohammed

Published in: Applied Sciences, 2022, vol. 12, no. 13, p. 6448

Pneumonia is a life-threatening lung infection resulting from several different viral infections. Identifying and treating pneumonia on chest X-ray images can be difficult due to its similarity to other pulmonary diseases. Thus, the existing methods

External link: http://arxiv.org/abs/2312.07965

Report

Medical Image Classification Using Transfer Learning and Chaos Game Optimization on the Internet of Medical Things

Authors: Mabrouk, Alhassan; Dahou, Abdelghani; Elaziz, Mohamed Abd; Redondo, Rebeca P. Díaz; Kayed, Mohammed

Published in: Computational Intelligence and Neuroscience, 2022, vol. 2022

The Internet of Medical Things (IoMT) has dramatically benefited medical professionals that patients and physicians can access from all regions. Although the automatic detection and prediction of diseases such as melanoma and leukemia is still being

External link: http://arxiv.org/abs/2312.07437

Report

Learning Saliency From Fixations

Authors: Djilali, Yasser Abdelaziz Dahou; McGuiness, Kevin; O'Connor, Noel

We present a novel approach for saliency prediction in images, leveraging parallel decoding in transformers to learn saliency solely from fixation maps. Models typically rely on continuous saliency maps, to overcome the difficulty of optimizing for t

External link: http://arxiv.org/abs/2311.14073

Report

Do VSR Models Generalize Beyond LRS3?

Authors: Djilali, Yasser Abdelaziz Dahou; Narayan, Sanath; Bihan, Eustache Le; Boussaid, Haithem; Almazrouei, Ebtessam; Debbah, Merouane

The Lip Reading Sentences-3 (LRS3) benchmark has primarily been the focus of intense research in visual speech recognition (VSR) during the last few years. As a result, there is an increased risk of overfitting to its excessively used test set, which

External link: http://arxiv.org/abs/2311.14063
