Showing 1 - 10 of 35 for search: '"A Najdenkoska"'
Author:
Najdenkoska, Ivona, Derakhshani, Mohammad Mahdi, Asano, Yuki M., van Noord, Nanne, Worring, Marcel, Snoek, Cees G. M.
We address the challenge of representing long captions in vision-language models, such as CLIP. By design these models are limited by fixed, absolute positional encodings, restricting inputs to a maximum of 77 tokens and hindering performance on task
External link:
http://arxiv.org/abs/2410.10034
Vision-Language Models (VLMs) have shown remarkable capabilities in a large number of downstream tasks. Nonetheless, compositional image understanding remains a rather difficult task due to the object bias present in training data. In this work, we i
External link:
http://arxiv.org/abs/2407.15487
Author:
Najdenkoska, Ivona, Sinha, Animesh, Dubey, Abhimanyu, Mahajan, Dhruv, Ramanathan, Vignesh, Radenovic, Filip
We propose Context Diffusion, a diffusion-based framework that enables image generation models to learn from visual examples presented in context. Recent work tackles such in-context learning for image generation, where a query image is provided alon
External link:
http://arxiv.org/abs/2312.03584
Author:
Derakhshani, Mohammad Mahdi, Najdenkoska, Ivona, Snoek, Cees G. M., Worring, Marcel, Asano, Yuki M.
We present Self-Context Adaptation (SeCAt), a self-supervised approach that unlocks few-shot abilities for open-ended classification with small visual language models. Our approach imitates image captions in a self-supervised way based on clustering
External link:
http://arxiv.org/abs/2310.00500
Author:
van Sonsbeek, Tom, Derakhshani, Mohammad Mahdi, Najdenkoska, Ivona, Snoek, Cees G. M., Worring, Marcel
Medical Visual Question Answering (VQA) is an important challenge, as it would lead to faster and more accurate diagnoses and treatment decisions. Most existing methods approach it as a multi-class classification problem, which restricts the outcome
External link:
http://arxiv.org/abs/2303.05977
Multimodal few-shot learning is challenging due to the large domain gap between vision and language modalities. Existing methods are trying to communicate visual concepts as prompts to frozen language models, but rely on hand-engineered task inductio
External link:
http://arxiv.org/abs/2302.14794
Author:
Derakhshani, Mohammad Mahdi, Najdenkoska, Ivona, van Sonsbeek, Tom, Zhen, Xiantong, Mahapatra, Dwarikanath, Worring, Marcel, Snoek, Cees G. M.
Deep learning models have shown a great effectiveness in recognition of findings in medical images. However, they cannot handle the ever-changing clinical environment, bringing newly annotated medical data from different sources. To exploit the incom
External link:
http://arxiv.org/abs/2204.05737
Automating report generation for medical imaging promises to reduce workload and assist diagnosis in clinical practice. Recent work has shown that deep learning models can successfully caption natural images. However, learning from medical data is ch
External link:
http://arxiv.org/abs/2107.07314
Published in:
In Medical Image Analysis, November 2022, 82
Author:
Anita Najdenkoska, Zorica Arsova-Sarafinovska, Lenche Velkovska-Markovska, Mirjana S. Jankulovska, Claudia Zoani
Published in:
Macedonian Journal of Chemistry and Chemical Engineering, Vol 39, Iss 2, Pp 287-288 (2021)
METROFOOD-PP project represents the “Preparatory Phase” of METROFOOD-RI - Infrastructure for Promoting Metrology in Food and Nutrition that has received funding from the European Union’s Horizon 2020 research and innovation programme under Gran
External link:
https://doaj.org/article/52664b407bf74aecb2eb3cab652ba76d