Showing 1 - 10 of 15,339
for search: '"Anwer, A."'
Author:
Mullappilly, Sahal Shaji, Kurpath, Mohammed Irfan, Pieri, Sara, Alseiari, Saeed Yahya, Cholakkal, Shanavas, Aldahmani, Khaled, Khan, Fahad, Anwer, Rao, Khan, Salman, Baldwin, Timothy, Cholakkal, Hisham
This paper introduces BiMediX2, a bilingual (Arabic-English) Bio-Medical EXpert Large Multimodal Model (LMM) with a unified architecture that integrates text and visual modalities, enabling advanced image understanding and medical applications. BiMed…
External link:
http://arxiv.org/abs/2412.07769
Author:
Vayani, Ashmal, Dissanayake, Dinura, Watawana, Hasindri, Ahsan, Noor, Sasikumar, Nevasini, Thawakar, Omkar, Ademtew, Henok Biadglign, Hmaiti, Yahya, Kumar, Amandeep, Kuckreja, Kartik, Maslych, Mykola, Ghallabi, Wafa Al, Mihaylov, Mihail, Qin, Chao, Shaker, Abdelrahman M, Zhang, Mike, Ihsani, Mahardika Krisna, Esplana, Amiel, Gokani, Monil, Mirkin, Shachar, Singh, Harsh, Srivastava, Ashay, Hamerlik, Endre, Izzati, Fathinah Asma, Maani, Fadillah Adamsyah, Cavada, Sebastian, Chim, Jenny, Gupta, Rohit, Manjunath, Sanjay, Zhumakhanova, Kamila, Rabevohitra, Feno Heriniaina, Amirudin, Azril, Ridzuan, Muhammad, Kareem, Daniya, More, Ketan, Li, Kunyang, Shakya, Pramesh, Saad, Muhammad, Ghasemaghaei, Amirpouya, Djanibekov, Amirbek, Azizov, Dilshod, Jankovic, Branislava, Bhatia, Naman, Cabrera, Alvaro, Obando-Ceron, Johan, Otieno, Olympiah, Farestam, Fabian, Rabbani, Muztoba, Baliah, Sanoojan, Sanjeev, Santosh, Shtanchaev, Abduragim, Fatima, Maheen, Nguyen, Thao, Kareem, Amrin, Aremu, Toluwani, Xavier, Nathan, Bhatkal, Amit, Toyin, Hawau, Chadha, Aman, Cholakkal, Hisham, Anwer, Rao Muhammad, Felsberg, Michael, Laaksonen, Jorma, Solorio, Thamar, Choudhury, Monojit, Laptev, Ivan, Shah, Mubarak, Khan, Salman, Khan, Fahad
Existing Large Multimodal Models (LMMs) generally focus on only a few regions and languages. As LMMs continue to improve, it is increasingly important to ensure they understand cultural contexts, respect local sensitivities, and support low-resource…
External link:
http://arxiv.org/abs/2411.16508
Author:
Ghaboura, Sara, Heakl, Ahmed, Thawakar, Omkar, Alharthi, Ali, Riahi, Ines, Saif, Abduljalil, Laaksonen, Jorma, Khan, Fahad S., Khan, Salman, Anwer, Rao M.
Recent years have witnessed a significant interest in developing large multimodal models (LMMs) capable of performing various visual reasoning and understanding tasks. This has led to the introduction of multiple LMM benchmarks to evaluate LMMs on di…
External link:
http://arxiv.org/abs/2410.18976
Author:
Awais, Muhammad, Alharthi, Ali Husain Salem Abdulla, Kumar, Amandeep, Cholakkal, Hisham, Anwer, Rao Muhammad
Significant progress has been made in advancing large multimodal conversational models (LMMs), capitalizing on vast repositories of image-text data available online. Despite this progress, these models often encounter substantial domain gaps, hinderi…
External link:
http://arxiv.org/abs/2410.08405
Recently, the Segment Anything Model (SAM) has demonstrated promising segmentation capabilities in a variety of downstream segmentation tasks. However, in the context of universal medical image segmentation there exists a notable performance discrepan…
External link:
http://arxiv.org/abs/2410.04172
Author:
Ishaq, Ayesha, Boudjoghra, Mohamed El Amine, Lahoud, Jean, Khan, Fahad Shahbaz, Khan, Salman, Cholakkal, Hisham, Anwer, Rao Muhammad
3D multi-object tracking plays a critical role in autonomous driving by enabling the real-time monitoring and prediction of multiple objects' movements. Traditional 3D tracking systems are typically constrained by predefined object categories, limiti…
External link:
http://arxiv.org/abs/2410.01678
Author:
Nawaz, Umair, Awais, Muhammad, Gani, Hanan, Naseer, Muzammal, Khan, Fahad, Khan, Salman, Anwer, Rao Muhammad
Capitalizing on vast amounts of image-text data, large-scale vision-language pre-training has demonstrated remarkable zero-shot capabilities and has been utilized in several applications. However, models trained on general everyday web-crawled data of…
External link:
http://arxiv.org/abs/2410.01407
Author:
Noman, Mubashir, Ahsan, Noor, Naseer, Muzammal, Cholakkal, Hisham, Anwer, Rao Muhammad, Khan, Salman, Khan, Fahad Shahbaz
Large multimodal models (LMMs) have shown encouraging performance in the natural image domain using visual instruction tuning. However, these LMMs struggle to describe the content of remote sensing images for tasks such as image or region grounding,…
External link:
http://arxiv.org/abs/2409.16261
The prolific use of Large Language Models (LLMs) as an alternate knowledge base requires them to be factually consistent, necessitating both correctness and consistency traits for paraphrased queries. Recently, significant attempts have been made to…
External link:
http://arxiv.org/abs/2409.14065
Author:
Li, Long, Liu, Nian, Zhang, Dingwen, Li, Zhongyu, Khan, Salman, Anwer, Rao, Cholakkal, Hisham, Han, Junwei, Khan, Fahad Shahbaz
Published in:
ECCV2024
Inter-image association modeling is crucial for co-salient object detection. Despite satisfactory performance, previous methods still fall short of sufficient inter-image association modeling, because most of them focus on image feature optimiz…
External link:
http://arxiv.org/abs/2409.01021