Search results - "Tonioni, Alessio"

Report

BRAVE: Broadening the visual encoding of vision-language models

Authors: Kar, Oğuzhan Fatih; Tonioni, Alessio; Poklukar, Petra; Kulshrestha, Achin; Zamir, Amir; Tombari, Federico

Vision-language models (VLMs) are typically composed of a vision encoder, e.g. CLIP, and a language model (LM) that interprets the encoded features to solve downstream tasks. Despite remarkable progress, VLMs are subject to several shortcomings due t

External link: http://arxiv.org/abs/2404.07204

Report

Snap-it, Tap-it, Splat-it: Tactile-Informed 3D Gaussian Splatting for Reconstructing Challenging Surfaces

Authors: Comi, Mauro; Tonioni, Alessio; Yang, Max; Tremblay, Jonathan; Blukis, Valts; Lin, Yijiong; Lepora, Nathan F.; Aitchison, Laurence

Touch and vision go hand in hand, mutually enhancing our ability to understand the world. From a research perspective, the problem of mixing touch and vision is underexplored and presents interesting challenges. To this end, we propose Tactile-Inform

External link: http://arxiv.org/abs/2403.20275

Report

InseRF: Text-Driven Generative Object Insertion in Neural 3D Scenes

Authors: Shahbazi, Mohamad; Claessens, Liesbeth; Niemeyer, Michael; Collins, Edo; Tonioni, Alessio; Van Gool, Luc; Tombari, Federico

We introduce InseRF, a novel method for generative object insertion in the NeRF reconstructions of 3D scenes. Based on a user-provided textual description and a 2D bounding box in a reference viewpoint, InseRF generates new objects in 3D scenes. Rece

External link: http://arxiv.org/abs/2401.05335

Report

Text-Conditioned Resampler For Long Form Video Understanding

Authors: Korbar, Bruno; Xian, Yongqin; Tonioni, Alessio; Zisserman, Andrew; Tombari, Federico

In this paper we present a text-conditioned video resampler (TCR) module that uses a pre-trained and frozen visual encoder and large language model (LLM) to process long video sequences for a task. TCR localises relevant visual features from the vide

External link: http://arxiv.org/abs/2312.11897

Report

LIME: Localized Image Editing via Attention Regularization in Diffusion Models

Authors: Simsar, Enis; Tonioni, Alessio; Xian, Yongqin; Hofmann, Thomas; Tombari, Federico

Diffusion models (DMs) have gained prominence due to their ability to generate high-quality, varied images, with recent advancements in text-to-image generation. The research focus is now shifting towards the controllability of DMs. A significant cha

External link: http://arxiv.org/abs/2312.09256

Report

TouchSDF: A DeepSDF Approach for 3D Shape Reconstruction using Vision-Based Tactile Sensing

Authors: Comi, Mauro; Lin, Yijiong; Church, Alex; Tonioni, Alessio; Aitchison, Laurence; Lepora, Nathan F.

Humans rely on their visual and tactile senses to develop a comprehensive 3D understanding of their physical environment. Recently, there has been a growing interest in exploring and manipulating objects using data-driven approaches that utilise high

External link: http://arxiv.org/abs/2311.12602

Report

TextMesh: Generation of Realistic 3D Meshes From Text Prompts

Authors: Tsalicoglou, Christina; Manhardt, Fabian; Tonioni, Alessio; Niemeyer, Michael; Tombari, Federico

The ability to generate highly realistic 2D images from mere text prompts has recently made huge progress in terms of speed and quality, thanks to the advent of image diffusion models. Naturally, the question arises if this can be also achieved in th

External link: http://arxiv.org/abs/2304.12439

Report

NeRF-Supervised Deep Stereo

Authors: Tosi, Fabio; Tonioni, Alessio; De Gregorio, Daniele; Poggi, Matteo

We introduce a novel framework for training deep stereo networks effortlessly and without any ground-truth. By leveraging state-of-the-art neural rendering solutions, we generate stereo training data from image sequences collected with a single handh

External link: http://arxiv.org/abs/2303.17603

Report

NeRF-GAN Distillation for Efficient 3D-Aware Generation with Convolutions

Authors: Shahbazi, Mohamad; Ntavelis, Evangelos; Tonioni, Alessio; Collins, Edo; Paudel, Danda Pani; Danelljan, Martin; Van Gool, Luc

Pose-conditioned convolutional generative models struggle with high-quality 3D-consistent image generation from single-view datasets, due to their lack of sufficient 3D priors. Recently, the integration of Neural Radiance Fields (NeRFs) and generativ

External link: http://arxiv.org/abs/2303.12865

Report

Learning Good Features to Transfer Across Tasks and Domains

Authors: Ramirez, Pierluigi Zama; Cardace, Adriano; De Luigi, Luca; Tonioni, Alessio; Salti, Samuele; Di Stefano, Luigi

Availability of labelled data is the major obstacle to the deployment of deep learning algorithms for computer vision tasks in new domains. The fact that many frameworks adopted to solve different tasks share the same architecture suggests that there

External link: http://arxiv.org/abs/2301.11310
