Showing 1 - 8 of 8 for search: '"Ashual, Oron"'
Author:
Sheynin, Shelly, Polyak, Adam, Singer, Uriel, Kirstain, Yuval, Zohar, Amit, Ashual, Oron, Parikh, Devi, Taigman, Yaniv
Instruction-based image editing holds immense potential for a variety of applications, as it enables users to perform any editing operation using a natural language instruction. However, current models in this domain often struggle with accurately …
External link:
http://arxiv.org/abs/2311.10089
Author:
Yu, Lili, Shi, Bowen, Pasunuru, Ramakanth, Muller, Benjamin, Golovneva, Olga, Wang, Tianlu, Babu, Arun, Tang, Binh, Karrer, Brian, Sheynin, Shelly, Ross, Candace, Polyak, Adam, Howes, Russell, Sharma, Vasu, Xu, Puxin, Tamoyan, Hovhannes, Ashual, Oron, Singer, Uriel, Li, Shang-Wen, Zhang, Susan, James, Richard, Ghosh, Gargi, Taigman, Yaniv, Fazel-Zarandi, Maryam, Celikyilmaz, Asli, Zettlemoyer, Luke, Aghajanyan, Armen
We present CM3Leon (pronounced "Chameleon"), a retrieval-augmented, token-based, decoder-only multi-modal language model capable of generating and infilling both text and images. CM3Leon uses the CM3 multi-modal architecture but additionally shows …
External link:
http://arxiv.org/abs/2309.02591
Author:
Singer, Uriel, Sheynin, Shelly, Polyak, Adam, Ashual, Oron, Makarov, Iurii, Kokkinos, Filippos, Goyal, Naman, Vedaldi, Andrea, Parikh, Devi, Johnson, Justin, Taigman, Yaniv
We present MAV3D (Make-A-Video3D), a method for generating three-dimensional dynamic scenes from text descriptions. Our approach uses a 4D dynamic Neural Radiance Field (NeRF), which is optimized for scene appearance, density, and motion consistency …
External link:
http://arxiv.org/abs/2301.11280
Author:
Singer, Uriel, Polyak, Adam, Hayes, Thomas, Yin, Xi, An, Jie, Zhang, Songyang, Hu, Qiyuan, Yang, Harry, Ashual, Oron, Gafni, Oran, Parikh, Devi, Gupta, Sonal, Taigman, Yaniv
We propose Make-A-Video -- an approach for directly translating the tremendous recent progress in Text-to-Image (T2I) generation to Text-to-Video (T2V). Our intuition is simple: learn what the world looks like and how it is described from paired text …
External link:
http://arxiv.org/abs/2209.14792
Author:
Sheynin, Shelly, Ashual, Oron, Polyak, Adam, Singer, Uriel, Gafni, Oran, Nachmani, Eliya, Taigman, Yaniv
Recent text-to-image models have achieved impressive results. However, since they require large-scale datasets of text-image pairs, it is impractical to train them on new domains where data is scarce or not labeled. In this work, we propose using …
External link:
http://arxiv.org/abs/2204.02849
Recent text-to-image generation methods provide a simple yet exciting conversion capability between text and image domains. While these methods have incrementally improved the generated image fidelity and text relevancy, several pivotal gaps remain …
External link:
http://arxiv.org/abs/2203.13131
The task of motion transfer between a source dancer and a target person is a special case of the pose transfer problem, in which the target person changes their pose in accordance with the motions of the dancer. In this work, we propose a novel method …
External link:
http://arxiv.org/abs/2012.01158
Author:
Ashual, Oron, Wolf, Lior
Published in:
The IEEE International Conference on Computer Vision (ICCV), 2019
We introduce a method for the generation of images from an input scene graph. The method separates between a layout embedding and an appearance embedding. The dual embedding leads to generated images that better match the scene graph, have higher …
External link:
http://arxiv.org/abs/1909.05379