Showing 1 - 10 of 201 for search: '"He Kaiming"'
Author:
Fan, Lijie, Li, Tianhong, Qin, Siyang, Li, Yuanzhen, Sun, Chen, Rubinstein, Michael, Sun, Deqing, He, Kaiming, Tian, Yonglong
Scaling up autoregressive models in vision has not proven as beneficial as in large language models. In this work, we investigate this scaling problem in the context of text-to-image generation, focusing on two critical factors: whether models use di
External link:
http://arxiv.org/abs/2410.13863
Published in:
NeurIPS 2024
One of the roadblocks for training generalist robotic models today is heterogeneity. Previous robot learning methods often collect data to train with one specific embodiment for one task, which is expensive and prone to overfitting. This work studies
External link:
http://arxiv.org/abs/2409.20537
Conventional wisdom holds that autoregressive models for image generation are typically accompanied by vector-quantized tokens. We observe that while a discrete-valued space can facilitate representing a categorical distribution, it is not a necessit
External link:
http://arxiv.org/abs/2406.11838
Author:
Guo, Minghao, Wang, Bohan, Ma, Pingchuan, Zhang, Tianyuan, Owens, Crystal Elaine, Gan, Chuang, Tenenbaum, Joshua B., He, Kaiming, Matusik, Wojciech
We present a computational framework that transforms single images into 3D physical objects. The visual geometry of a physical object in an image is determined by three orthogonal attributes: mechanical properties, external forces, and rest-shape geo
External link:
http://arxiv.org/abs/2405.20510
We introduce TetSphere Splatting, a Lagrangian geometry representation designed for high-quality 3D shape modeling. TetSphere splatting leverages an underused yet powerful geometric primitive -- volumetric tetrahedral meshes. It represents 3D shapes
External link:
http://arxiv.org/abs/2405.20283
A central challenge in quantum information science and technology is achieving real-time estimation and feedforward control of quantum systems. This challenge is compounded by the inherent inhomogeneity of quantum resources, such as qubit properties
External link:
http://arxiv.org/abs/2405.16380
Author:
Liu, Zhuang, He, Kaiming
We revisit the "dataset classification" experiment suggested by Torralba and Efros a decade ago, in the new era with large-scale, diverse, and hopefully less biased datasets as well as more capable neural network architectures. Surprisingly, we obser
External link:
http://arxiv.org/abs/2403.08632
In this study, we examine the representation learning abilities of Denoising Diffusion Models (DDM) that were originally purposed for image generation. Our philosophy is to deconstruct a DDM, gradually transforming it into a classical Denoising Autoe
External link:
http://arxiv.org/abs/2401.14404
Unconditional generation -- the problem of modeling data distribution without relying on human-annotated labels -- is a long-standing and fundamental challenge in generative models, creating a potential of learning from large-scale unlabeled data. In
External link:
http://arxiv.org/abs/2312.03701
We present Fast Language-Image Pre-training (FLIP), a simple and more efficient method for training CLIP. Our method randomly masks out and removes a large portion of image patches during training. Masking allows us to learn from more image-text pair
External link:
http://arxiv.org/abs/2212.00794