Showing 1 - 10 of 225 for search: '"Laaksonen Jorma"'
Author:
Vayani, Ashmal, Dissanayake, Dinura, Watawana, Hasindri, Ahsan, Noor, Sasikumar, Nevasini, Thawakar, Omkar, Ademtew, Henok Biadglign, Hmaiti, Yahya, Kumar, Amandeep, Kuckreja, Kartik, Maslych, Mykola, Ghallabi, Wafa Al, Mihaylov, Mihail, Qin, Chao, Shaker, Abdelrahman M, Zhang, Mike, Ihsani, Mahardika Krisna, Esplana, Amiel, Gokani, Monil, Mirkin, Shachar, Singh, Harsh, Srivastava, Ashay, Hamerlik, Endre, Izzati, Fathinah Asma, Maani, Fadillah Adamsyah, Cavada, Sebastian, Chim, Jenny, Gupta, Rohit, Manjunath, Sanjay, Zhumakhanova, Kamila, Rabevohitra, Feno Heriniaina, Amirudin, Azril, Ridzuan, Muhammad, Kareem, Daniya, More, Ketan, Li, Kunyang, Shakya, Pramesh, Saad, Muhammad, Ghasemaghaei, Amirpouya, Djanibekov, Amirbek, Azizov, Dilshod, Jankovic, Branislava, Bhatia, Naman, Cabrera, Alvaro, Obando-Ceron, Johan, Otieno, Olympiah, Farestam, Fabian, Rabbani, Muztoba, Baliah, Sanoojan, Sanjeev, Santosh, Shtanchaev, Abduragim, Fatima, Maheen, Nguyen, Thao, Kareem, Amrin, Aremu, Toluwani, Xavier, Nathan, Bhatkal, Amit, Toyin, Hawau, Chadha, Aman, Cholakkal, Hisham, Anwer, Rao Muhammad, Felsberg, Michael, Laaksonen, Jorma, Solorio, Thamar, Choudhury, Monojit, Laptev, Ivan, Shah, Mubarak, Khan, Salman, Khan, Fahad
Existing Large Multimodal Models (LMMs) generally focus on only a few regions and languages. As LMMs continue to improve, it is increasingly important to ensure they understand cultural contexts, respect local sensitivities, and support low-resource …
External link:
http://arxiv.org/abs/2411.16508
Author:
Ghaboura, Sara, Heakl, Ahmed, Thawakar, Omkar, Alharthi, Ali, Riahi, Ines, Saif, Abduljalil, Laaksonen, Jorma, Khan, Fahad S., Khan, Salman, Anwer, Rao M.
Recent years have witnessed a significant interest in developing large multimodal models (LMMs) capable of performing various visual reasoning and understanding tasks. This has led to the introduction of multiple LMM benchmarks to evaluate LMMs on …
External link:
http://arxiv.org/abs/2410.18976
Author:
Zheng, Peng, Gao, Dehong, Fan, Deng-Ping, Liu, Li, Laaksonen, Jorma, Ouyang, Wanli, Sebe, Nicu
We introduce a novel bilateral reference framework (BiRefNet) for high-resolution dichotomous image segmentation (DIS). It comprises two essential components: the localization module (LM) and the reconstruction module (RM) with our proposed bilateral …
External link:
http://arxiv.org/abs/2401.03407
Conventional feature extraction techniques in the face anti-spoofing domain either analyze the entire video sequence or focus on a specific segment to improve model performance. However, identifying the optimal frames that provide the most valuable …
External link:
http://arxiv.org/abs/2309.04958
With the growing availability of databases for face presentation attack detection, researchers are increasingly focusing on video-based face anti-spoofing methods that involve hundreds to thousands of images for training the models. However, there is …
External link:
http://arxiv.org/abs/2308.12364
Published in:
SIGIR, 2023, 2261-2265
Vision-language (VL) pre-training (VLP) has been shown to generalize VL models well over a wide range of VL downstream tasks, especially cross-modal retrieval. However, it hinges on a huge amount of image-text pairs, which requires tedious and costly …
External link:
http://arxiv.org/abs/2307.07341
Face presentation attacks (PA), also known as spoofing attacks, pose a substantial threat to biometric systems that rely on facial recognition, such as access control systems, mobile payments, and identity verification systems. To mitigate …
External link:
http://arxiv.org/abs/2307.02858
Author:
Thawakar, Omkar, Shaker, Abdelrahman, Mullappilly, Sahal Shaji, Cholakkal, Hisham, Anwer, Rao Muhammad, Khan, Salman, Laaksonen, Jorma, Khan, Fahad Shahbaz
The latest breakthroughs in large vision-language models, such as Bard and GPT-4, have showcased extraordinary abilities in performing a wide range of tasks. Such models are trained on massive datasets comprising billions of public image-text pairs …
External link:
http://arxiv.org/abs/2306.07971
Author:
Kumar, Amandeep, Bhunia, Ankan kumar, Narayan, Sanath, Cholakkal, Hisham, Anwer, Rao Muhammad, Laaksonen, Jorma, Khan, Fahad Shahbaz
In this work, we propose a few-shot colorectal tissue image generation method for addressing the scarcity of histopathological training data for rare cancer tissues. Our few-shot generation method, named XM-GAN, takes one base and a pair of reference …
External link:
http://arxiv.org/abs/2304.01992
Author:
Thawakar, Omkar, Narayan, Sanath, Cholakkal, Hisham, Anwer, Rao Muhammad, Khan, Salman, Laaksonen, Jorma, Shah, Mubarak, Khan, Fahad Shahbaz
Existing video instance segmentation (VIS) approaches generally follow a closed-world assumption, where only seen category instances are identified and spatio-temporally segmented at inference. Open-world formulation relaxes the closed-world static …
External link:
http://arxiv.org/abs/2304.01200