Showing 1 - 10 of 70 for search: '"Kalantidis, Yannis"'
Author:
Sariyildiz, Mert Bulent, Weinzaepfel, Philippe, Lucas, Thomas, Larlus, Diane, Kalantidis, Yannis
Pretrained models have become a commodity and offer strong results on a broad range of tasks. In this work, we focus on classification and seek to learn a unique encoder able to take from several complementary pretrained models. We aim at even strong
External link:
http://arxiv.org/abs/2408.05088
Vision-Language Models (VLMs) have demonstrated impressive performance on zero-shot classification, i.e. classification when provided merely with a list of class names. In this paper, we tackle the case of zero-shot classification in the presence of
External link:
http://arxiv.org/abs/2404.04072
Author:
Kalantidis, Yannis, Sarıyıldız, Mert Bülent, Rezende, Rafael S., Weinzaepfel, Philippe, Larlus, Diane, Csurka, Gabriela
State-of-the-art visual localization approaches generally rely on a first image retrieval step whose role is crucial. Yet, retrieval often struggles when facing varying conditions, due to e.g. weather or time of day, with dramatic consequences on the
External link:
http://arxiv.org/abs/2402.09237
Few-shot action recognition, i.e. recognizing new action classes given only a few examples, benefits from incorporating temporal information. Prior work either encodes such information in the representation itself and learns classifiers at test time,
External link:
http://arxiv.org/abs/2303.16084
Recent image generation models such as Stable Diffusion have exhibited an impressive ability to generate fairly realistic images starting from a simple text prompt. Could such models render real images obsolete for training image prediction models? I
External link:
http://arxiv.org/abs/2212.08420
Strong image search models can be learned for a specific domain, i.e. a set of labels, provided that some labeled images of that domain are available. A practical visual search model, however, should be versatile enough to solve multiple retrieval tasks
External link:
http://arxiv.org/abs/2210.02254
Author:
Baradel, Fabien, Brégier, Romain, Groueix, Thibault, Weinzaepfel, Philippe, Kalantidis, Yannis, Rogez, Grégory
Training state-of-the-art models for human pose estimation in videos requires datasets with annotations that are really hard and expensive to obtain. Although transformers have been recently utilized for body pose sequence modeling, related methods r
External link:
http://arxiv.org/abs/2208.10211
We consider the problem of training a deep neural network on a given classification task, e.g., ImageNet-1K (IN1K), so that it excels at both the training task and other (future) transfer tasks. These two seemingly contradictory properties
External link:
http://arxiv.org/abs/2206.15369
Methods that combine local and global features have recently shown excellent performance on multiple challenging deep image retrieval benchmarks, but their use of local features raises at least two issues. First, these local features simply boil down
External link:
http://arxiv.org/abs/2201.13182
Dimensionality reduction methods are unsupervised approaches which learn low-dimensional spaces where some properties of the initial space, typically the notion of "neighborhood", are preserved. Such methods usually require propagation on large k-NN
External link:
http://arxiv.org/abs/2110.09455