Showing 1 - 10 of 810 for search: '"Aberman A"'
Author:
Qian, Guocheng, Wang, Kuan-Chieh, Patashnik, Or, Heravi, Negin, Ostashev, Daniil, Tulyakov, Sergey, Cohen-Or, Daniel, Aberman, Kfir
We introduce Omni-ID, a novel facial representation designed specifically for generative tasks. Omni-ID encodes holistic information about an individual's appearance across diverse expressions and poses within a fixed-size representation. It consolid
External link:
http://arxiv.org/abs/2412.09694
Face image restoration aims to enhance degraded facial images while addressing challenges such as diverse degradation types, real-time processing demands, and, most crucially, the preservation of identity-specific features. Existing methods often str
External link:
http://arxiv.org/abs/2412.06753
Author:
Avrahami, Omri, Patashnik, Or, Fried, Ohad, Nemchinov, Egor, Aberman, Kfir, Lischinski, Dani, Cohen-Or, Daniel
Diffusion models have revolutionized the field of content synthesis and editing. Recent models have replaced the traditional UNet architecture with the Diffusion Transformer (DiT), and employed flow-matching for improved training and sampling. Howeve
External link:
http://arxiv.org/abs/2411.14430
Author:
Gong, Yifan, Zhan, Zheng, Li, Yanyu, Idelbayev, Yerlan, Zharkov, Andrey, Aberman, Kfir, Tulyakov, Sergey, Wang, Yanzhi, Ren, Jian
Good weight initialization serves as an effective measure to reduce the training cost of a deep neural network (DNN) model. The choice of how to initialize parameters is challenging and may require manual tuning, which can be time-consuming and prone
External link:
http://arxiv.org/abs/2407.11966
Author:
Dravid, Amil, Gandelsman, Yossi, Wang, Kuan-Chieh, Abdal, Rameen, Wetzstein, Gordon, Efros, Alexei A., Aberman, Kfir
We investigate the space of weights spanned by a large collection of customized diffusion models. We populate this space by creating a dataset of over 60,000 models, each of which is a base model fine-tuned to insert a different person's visual ident
External link:
http://arxiv.org/abs/2406.09413
We introduce a new architecture for personalization of text-to-image diffusion models, coined Mixture-of-Attention (MoA). Inspired by the Mixture-of-Experts mechanism utilized in large language models (LLMs), MoA distributes the generation workload b
External link:
http://arxiv.org/abs/2404.11565
Text-to-image diffusion models have an unprecedented ability to generate diverse and high-quality images. However, they often struggle to faithfully capture the intended semantics of complex input prompts that include multiple subjects. Recently, num
External link:
http://arxiv.org/abs/2403.16990
Recent large-scale vision-language models (VLMs) have demonstrated remarkable capabilities in understanding and generating textual descriptions for visual content. However, these models lack an understanding of user-specific concepts. In this work, w
External link:
http://arxiv.org/abs/2403.14599
Author:
Qian, Guocheng, Cao, Junli, Siarohin, Aliaksandr, Kant, Yash, Wang, Chaoyang, Vasilkovsky, Michael, Lee, Hsin-Ying, Fang, Yuwei, Skorokhodov, Ivan, Zhuang, Peiye, Gilitschenski, Igor, Ren, Jian, Ghanem, Bernard, Aberman, Kfir, Tulyakov, Sergey
We introduce Amortized Text-to-Mesh (AToM), a feed-forward text-to-mesh framework optimized across multiple text prompts simultaneously. In contrast to existing text-to-3D methods that often entail time-consuming per-prompt optimization and commonly
External link:
http://arxiv.org/abs/2402.00867
Author:
Gong, Yifan, Zhan, Zheng, Jin, Qing, Li, Yanyu, Idelbayev, Yerlan, Liu, Xian, Zharkov, Andrey, Aberman, Kfir, Tulyakov, Sergey, Wang, Yanzhi, Ren, Jian
One highly promising direction for enabling flexible real-time on-device image editing is utilizing data distillation by leveraging large-scale text-to-image diffusion models to generate paired datasets used for training generative adversarial networ
External link:
http://arxiv.org/abs/2401.06127