Showing 1 - 10 of 29 for search: '"Kautz, Jan"'
Diffusion models have emerged as a key pillar of foundation models in visual domains. One of their critical applications is to universally solve different downstream inverse tasks via a single diffusion prior without re-training for each task. Most i
External link:
http://arxiv.org/abs/2305.04391
Authors:
Lim, Jae Hyun, Kovachki, Nikola B., Baptista, Ricardo, Beckham, Christopher, Azizzadenesheli, Kamyar, Kossaifi, Jean, Voleti, Vikram, Song, Jiaming, Kreis, Karsten, Kautz, Jan, Pal, Christopher, Vahdat, Arash, Anandkumar, Anima
Diffusion models have recently emerged as a powerful framework for generative modeling. They consist of a forward process that perturbs input data with Gaussian white noise and a reverse process that learns a score function to generate samples by den
External link:
http://arxiv.org/abs/2302.07400
Score-based generative models (SGMs) have recently demonstrated impressive results in terms of both sample quality and distribution coverage. However, they are usually applied directly in data space and often require thousands of network evaluations
External link:
http://arxiv.org/abs/2106.05931
Variational autoencoders (VAEs) are one of the powerful likelihood-based generative models with applications in many domains. However, they struggle to generate high-quality images, especially when samples are obtained from the prior without any temp
External link:
http://arxiv.org/abs/2010.02917
Energy-based models (EBMs) have recently been successful in representing complex distributions of small images. However, sampling from them requires expensive Markov chain Monte Carlo (MCMC) iterations that mix slowly in high dimensional pixel space.
External link:
http://arxiv.org/abs/2010.00654
Authors:
Vahdat, Arash, Kautz, Jan
Normalizing flows, autoregressive models, variational autoencoders (VAEs), and deep energy-based models are among competing likelihood-based frameworks for deep generative learning. Among them, VAEs have the advantage of fast and tractable sampling a
External link:
http://arxiv.org/abs/2007.03898
Phrase grounding, the problem of associating image regions to caption words, is a crucial component of vision-language tasks. We show that phrase grounding can be learned by optimizing word-region attention to maximize a lower bound on mutual informa
External link:
http://arxiv.org/abs/2006.09920
Learning from spatio-temporal data has numerous applications such as human-behavior analysis, object tracking, video compression, and physics simulation. However, existing methods still perform poorly on challenging video tasks such as long-term forec
External link:
http://arxiv.org/abs/2002.09131
Identifying the underlying directional relations from observational time series with nonlinear interactions and complex relational structures is key to a wide range of applications, yet remains a hard problem. In this work, we introduce a novel minim
External link:
http://arxiv.org/abs/2001.01885
Authors:
Yin, Hongxu, Molchanov, Pavlo, Li, Zhizhong, Alvarez, Jose M., Mallya, Arun, Hoiem, Derek, Jha, Niraj K., Kautz, Jan
We introduce DeepInversion, a new method for synthesizing images from the image distribution used to train a deep neural network. We 'invert' a trained network (teacher) to synthesize class-conditional input images starting from random noise, without
External link:
http://arxiv.org/abs/1912.08795