Showing 1 - 10 of 787 for search: '"An Tuytelaars"'
This paper proposes a self-learning framework to incrementally train (fine-tune) a personalized Keyword Spotting (KWS) model after deployment on ultra-low-power smart audio sensors. We address the fundamental problem of the absence of labeled tra…
External link:
http://arxiv.org/abs/2408.12481
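The snippet above describes self-learning without labels. A common generic ingredient of such pipelines (not necessarily the method of this paper) is confidence-thresholded pseudo-labeling: the deployed model labels its own incoming audio, and only high-confidence predictions are kept for fine-tuning. A minimal sketch, with the threshold value and function name chosen for illustration:

```python
import numpy as np

def pseudo_label(probs, threshold=0.9):
    """Keep only predictions whose max class probability clears the
    confidence threshold; return (indices, labels) for fine-tuning.
    `probs` is an (N, C) array of softmax outputs from the deployed model."""
    conf = probs.max(axis=1)
    keep = np.where(conf >= threshold)[0]
    return keep, probs[keep].argmax(axis=1)

# Toy batch: 3 utterances, 2 keyword classes.
probs = np.array([[0.95, 0.05],   # confident -> kept as class 0
                  [0.60, 0.40],   # uncertain -> discarded
                  [0.08, 0.92]])  # confident -> kept as class 1
idx, labels = pseudo_label(probs)
print(idx.tolist(), labels.tolist())  # -> [0, 2] [0, 1]
```

Discarding uncertain predictions limits the noise fed back into training, which matters when no human labels are available on-device.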
Autor:
Wu, Minye, Tuytelaars, Tinne
Recent advancements in photo-realistic novel view synthesis have been significantly driven by Gaussian Splatting (3DGS). Nevertheless, the explicit nature of 3DGS data entails considerable storage requirements, highlighting a pressing need for more e…
External link:
http://arxiv.org/abs/2408.10041
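The storage pressure mentioned above is easy to quantify from the standard 3DGS parameterization (position, rotation quaternion, scale, opacity, and degree-3 spherical-harmonic color coefficients); the 3-million-Gaussian scene size below is an illustrative assumption, not a figure from the paper:

```python
# Per-Gaussian parameters in the standard 3DGS layout (degree-3 SH):
params = {
    "position": 3, "rotation_quat": 4, "scale": 3,
    "opacity": 1, "sh_coeffs": 3 * 16,  # 3 color channels x 16 SH bases
}
floats = sum(params.values())          # 59 floats per Gaussian
bytes_per_gaussian = floats * 4        # stored as float32
scene_mb = 3_000_000 * bytes_per_gaussian / 1e6
print(floats, bytes_per_gaussian, round(scene_mb))  # -> 59 236 708
```

Hundreds of megabytes for a single scene is what motivates the compression work the snippet alludes to.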
Parameter-efficient fine-tuning (PEFT) methods are increasingly used with pre-trained language models (PLMs) for continual learning (CL). These methods involve training a PEFT module for each new task and using similarity-based selection to route mod…
External link:
http://arxiv.org/abs/2408.09053
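"Similarity-based selection" in the snippet above typically means matching an input's feature vector against one stored prototype per task and activating the corresponding PEFT module. A minimal sketch of that routing rule (the prototype representation and function name are assumptions, not this paper's design):

```python
import numpy as np

def route_module(query, prototypes):
    """Pick the PEFT module whose task prototype is most cosine-similar
    to the incoming example's feature vector."""
    q = query / np.linalg.norm(query)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    return int(np.argmax(p @ q))

prototypes = np.array([[1.0, 0.0], [0.0, 1.0]])  # one prototype per learned task
print(route_module(np.array([0.9, 0.1]), prototypes))  # -> 0
print(route_module(np.array([0.2, 0.8]), prototypes))  # -> 1
```

Routing errors in this step are a known failure mode: an example sent to the wrong task's module gets the wrong adaptation.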
Author:
Stegmüller, Thomas, Lebailly, Tim, Dukic, Nikola, Bozorgtabar, Behzad, Tuytelaars, Tinne, Thiran, Jean-Philippe
Zero-shot classification capabilities naturally arise in models trained within a vision-language contrastive framework. Despite their classification prowess, these models struggle in dense tasks like zero-shot open-vocabulary segmentation. This defic…
External link:
http://arxiv.org/abs/2406.16085
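The zero-shot classification the snippet refers to works by comparing an image embedding against text embeddings of class prompts (e.g. "a photo of a {class}") in a shared space, as in CLIP-style models. A generic sketch of the decision rule, with toy unit-vector embeddings standing in for real encoder outputs:

```python
import numpy as np

def zero_shot_classify(image_emb, text_embs):
    """Assign the class whose prompt embedding has the highest cosine
    similarity to the image embedding; all vectors L2-normalized, so
    the dot product is the cosine similarity."""
    return int(np.argmax(text_embs @ image_emb))

# Toy 3-class example with 3-D embeddings.
text_embs = np.eye(3)                 # one unit vector per class prompt
image_emb = np.array([0.1, 0.2, 0.97])
image_emb /= np.linalg.norm(image_emb)
print(zero_shot_classify(image_emb, text_embs))  # -> 2
```

Because this rule produces one label per image, extending it to per-pixel (dense) predictions is non-trivial, which is the gap the snippet points at.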
Author:
Hacohen, Guy, Tuytelaars, Tinne
Catastrophic forgetting poses a significant challenge in continual learning, where models often forget previous tasks when trained on new data. Our empirical analysis reveals a strong correlation between catastrophic forgetting and the learning speed…
External link:
http://arxiv.org/abs/2406.09935
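Forgetting in studies like the one above is commonly quantified per task as the drop from the best accuracy ever reached on that task to its accuracy after all subsequent training. A sketch of that widely used metric (a standard continual-learning convention, not necessarily the exact measure in this paper):

```python
def forgetting(acc_history):
    """Per-task forgetting: best accuracy ever achieved on a task minus
    its accuracy after training on all subsequent tasks.
    `acc_history` lists the task's accuracy after each training stage."""
    return max(acc_history[:-1]) - acc_history[-1]

# Task accuracy measured after stages 1, 2, and 3 of sequential training.
print(round(forgetting([0.90, 0.85, 0.60]), 2))  # -> 0.3
```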
Text-based semantic image editing involves manipulating an image according to a natural language instruction. Although recent works are capable of generating creative and qualitative images, the problem is still mostly approached as a black box sensit…
External link:
http://arxiv.org/abs/2404.18020
Author:
Trusca, Maria Mihaela, Nuyts, Wolf, Thomm, Jonathan, Honig, Robert, Hofmann, Thomas, Tuytelaars, Tinne, Moens, Marie-Francine
Current diffusion models create photorealistic images given a text prompt as input but struggle to correctly bind attributes mentioned in the text to the right objects in the image. This is evidenced by our novel image-graph alignment model called EP…
External link:
http://arxiv.org/abs/2404.13766
Inverse rendering aims to reconstruct the scene properties of objects solely from multiview images. However, it is an ill-posed problem prone to producing ambiguous estimations deviating from physically accurate representations. In this paper, we uti…
External link:
http://arxiv.org/abs/2404.12819
This paper proposes DriViDOC: a framework for Driving from Vision through Differentiable Optimal Control, and its application to learning autonomous driving controllers from human demonstrations. DriViDOC combines the automatic inference of relevant fea…
External link:
http://arxiv.org/abs/2403.15102
In recent years, diffusion models have made remarkable strides in text-to-video generation, sparking a quest for enhanced control over video outputs to more accurately reflect user intentions. Traditional efforts predominantly focus on employing eith…
External link:
http://arxiv.org/abs/2403.10179