Showing 1 - 10
of 1,764
for search: '"Aytar, A."'
In the rapidly evolving field of data science, efficiently navigating the expansive body of academic literature is crucial for informed decision-making and innovation. This paper presents an enhanced Retrieval-Augmented Generation (RAG) application, …
External link:
http://arxiv.org/abs/2412.15404
Author:
Geng, Daniel, Herrmann, Charles, Hur, Junhwa, Cole, Forrester, Zhang, Serena, Pfaff, Tobias, Lopez-Guevara, Tatiana, Doersch, Carl, Aytar, Yusuf, Rubinstein, Michael, Sun, Chen, Wang, Oliver, Owens, Andrew, Sun, Deqing
Motion control is crucial for generating expressive and compelling video content; however, most existing video generation models rely mainly on text prompts for control, which struggle to capture the nuances of dynamic actions and temporal composition…
External link:
http://arxiv.org/abs/2412.02700
We discuss some consistent issues on how RepNet has been evaluated in various papers. As a way to mitigate these issues, we report RepNet performance results on different datasets, and release evaluation code and the RepNet checkpoint to obtain these …
External link:
http://arxiv.org/abs/2411.08878
We introduce a dataset of annotations of temporal repetitions in videos. The dataset, OVR (pronounced as over), contains annotations for over 72K videos, with each annotation specifying the number of repetitions, the start and end time of the repetition…
External link:
http://arxiv.org/abs/2407.17085
Author:
Wu, Ziyi, Rubanova, Yulia, Kabra, Rishabh, Hudson, Drew A., Gilitschenski, Igor, Aytar, Yusuf, van Steenkiste, Sjoerd, Allen, Kelsey R., Kipf, Thomas
We address the problem of multi-object 3D pose control in image diffusion models. Instead of conditioning on a sequence of text tokens, we propose to use a set of per-object representations, Neural Assets, to control the 3D pose of individual objects…
External link:
http://arxiv.org/abs/2406.09292
Author:
Aytar, Dilsat Berin, Gunduc, Semra
Since technology is advancing so quickly in the modern era of information, data is becoming an essential resource in many fields. Correct data collection, organization, and analysis make it a potent tool for successful decision-making, process improvement…
External link:
http://arxiv.org/abs/2405.16286
We introduce a versatile $\textit{flexible-captioning}$ vision-language model (VLM) capable of generating region-specific descriptions of varying lengths. The model, FlexCap, is trained to produce length-conditioned captions for input bounding boxes, …
External link:
http://arxiv.org/abs/2403.12026
Author:
Bruce, Jake, Dennis, Michael, Edwards, Ashley, Parker-Holder, Jack, Shi, Yuge, Hughes, Edward, Lai, Matthew, Mavalankar, Aditi, Steigerwald, Richie, Apps, Chris, Aytar, Yusuf, Bechtle, Sarah, Behbahani, Feryal, Chan, Stephanie, Heess, Nicolas, Gonzalez, Lucy, Osindero, Simon, Ozair, Sherjil, Reed, Scott, Zhang, Jingwei, Zolna, Konrad, Clune, Jeff, de Freitas, Nando, Singh, Satinder, Rocktäschel, Tim
We introduce Genie, the first generative interactive environment trained in an unsupervised manner from unlabelled Internet videos. The model can be prompted to generate an endless variety of action-controllable virtual worlds described through text, …
External link:
http://arxiv.org/abs/2402.15391
Published in:
Journal of Integrated Care, 2024, Vol. 32, Issue 5, pp. 135-148.
External link:
http://www.emeraldinsight.com/doi/10.1108/JICA-08-2024-0044
Author:
Carreira, João, King, Michael, Pătrăucean, Viorica, Gokay, Dilara, Ionescu, Cătălin, Yang, Yi, Zoran, Daniel, Heyward, Joseph, Doersch, Carl, Aytar, Yusuf, Damen, Dima, Zisserman, Andrew
We introduce a framework for online learning from a single continuous video stream -- the way people and animals learn, without mini-batches, data augmentation or shuffling. This poses great challenges given the high correlation between consecutive video frames…
External link:
http://arxiv.org/abs/2312.00598