Showing 1 - 10 of 3,159 for search: '"Yuille, A."'
Diffusion models, and their generalization, flow matching, have had a remarkable impact on the field of media generation. Here, the conventional approach is to learn the complex mapping from a simple source distribution of Gaussian noise to the targe
External link:
http://arxiv.org/abs/2412.15213
Autoregressive (AR) modeling has achieved remarkable success in natural language processing by enabling models to generate text with coherence and contextual understanding through next token prediction. Recently, in image generation, VAR proposes sca
External link:
http://arxiv.org/abs/2412.15205
Author:
Lu, Taiming, Shu, Tianmin, Xiao, Junfei, Ye, Luoxin, Wang, Jiahao, Peng, Cheng, Wei, Chen, Khashabi, Daniel, Chellappa, Rama, Yuille, Alan, Chen, Jieneng
Understanding, navigating, and exploring the 3D physical real world has long been a central challenge in the development of artificial intelligence. In this work, we take a step toward this goal by introducing GenEx, a system capable of planning comp
External link:
http://arxiv.org/abs/2412.09624
3D spatial reasoning is the ability to analyze and interpret the positions, orientations, and spatial relationships of objects within the 3D space. This allows models to develop a comprehensive understanding of the 3D scene, enabling their applicabil
External link:
http://arxiv.org/abs/2412.07825
In this work, we introduce a generative approach for pose-free reconstruction of $360^{\circ}$ scenes from a limited number of uncalibrated 2D images. Pose-free scene reconstruction from incomplete, unposed observations is usually regularized with de
External link:
http://arxiv.org/abs/2411.15966
Author:
Cai, Yuanhao, Zhang, He, Zhang, Kai, Liang, Yixun, Ren, Mengwei, Luan, Fujun, Liu, Qing, Kim, Soo Ye, Zhang, Jianming, Zhang, Zhifei, Zhou, Yuqian, Lin, Zhe, Yuille, Alan
Existing feed-forward image-to-3D methods mainly rely on 2D multi-view diffusion models that cannot guarantee 3D consistency. These methods easily collapse when changing the prompt view direction and mainly handle object-centric prompt images. In thi
External link:
http://arxiv.org/abs/2411.14384
Planning with partial observation is a central challenge in embodied AI. A majority of prior works have tackled this challenge by developing agents that physically explore their environment to update their beliefs about the world state. In contrast,
External link:
http://arxiv.org/abs/2411.11844
There exists recent work in computer vision, named VAR, that proposes a new autoregressive paradigm for image generation. Diverging from the vanilla next-token prediction, VAR structurally reformulates the image generation into a coarse to fine next-
External link:
http://arxiv.org/abs/2411.10433
Touchstone Benchmark: Are We on the Right Way for Evaluating AI Algorithms for Medical Segmentation?
Author:
Bassi, Pedro R. A. S., Li, Wenxuan, Tang, Yucheng, Isensee, Fabian, Wang, Zifu, Chen, Jieneng, Chou, Yu-Cheng, Kirchhoff, Yannick, Rokuss, Maximilian, Huang, Ziyan, Ye, Jin, He, Junjun, Wald, Tassilo, Ulrich, Constantin, Baumgartner, Michael, Roy, Saikat, Maier-Hein, Klaus H., Jaeger, Paul, Ye, Yiwen, Xie, Yutong, Zhang, Jianpeng, Chen, Ziyang, Xia, Yong, Xing, Zhaohu, Zhu, Lei, Sadegheih, Yousef, Bozorgpour, Afshin, Kumari, Pratibha, Azad, Reza, Merhof, Dorit, Shi, Pengcheng, Ma, Ting, Du, Yuxin, Bai, Fan, Huang, Tiejun, Zhao, Bo, Wang, Haonan, Li, Xiaomeng, Gu, Hanxue, Dong, Haoyu, Yang, Jichen, Mazurowski, Maciej A., Gupta, Saumya, Wu, Linshan, Zhuang, Jiaxin, Chen, Hao, Roth, Holger, Xu, Daguang, Blaschko, Matthew B., Decherchi, Sergio, Cavalli, Andrea, Yuille, Alan L., Zhou, Zongwei
How can we test AI performance? This question seems trivial, but it isn't. Standard benchmarks often have problems such as in-distribution and small-size test sets, oversimplified metrics, unfair comparisons, and short-term outcome pressure. As a con
External link:
http://arxiv.org/abs/2411.03670
Author:
Bassi, Pedro R. A. S., Wu, Qilong, Li, Wenxuan, Decherchi, Sergio, Cavalli, Andrea, Yuille, Alan, Zhou, Zongwei
As medical datasets rapidly expand, creating detailed annotations of different body structures becomes increasingly expensive and time-consuming. We consider that requesting radiologists to create detailed annotations is unnecessarily burdensome and
External link:
http://arxiv.org/abs/2411.02753