Showing 1 - 10 of 293 for search: '"Yang, Michael Ying"'
Interactive image segmentation enables users to interact minimally with a machine, facilitating the gradual refinement of the segmentation mask for a target of interest. Previous studies have demonstrated impressive performance in extracting a single
External link:
http://arxiv.org/abs/2406.11472
Compositional 3D scene synthesis has diverse applications across a spectrum of industries such as robotics, films, and video games, as it closely mirrors the complexity of real-world multi-object environments. Conventional works typically employ shap
External link:
http://arxiv.org/abs/2403.12848
Humans perceive and construct the world as an arrangement of simple parametric models. In particular, we can often describe man-made environments using volumetric primitives such as cuboids or cylinders. Inferring these primitives is important for at
External link:
http://arxiv.org/abs/2403.10452
Visual Question Answering (VQA) is a challenging task of predicting the answer to a question about the content of an image. It requires deep understanding of both the textual question and visual image. Prior works directly evaluate the answering mode
External link:
http://arxiv.org/abs/2402.03896
Change detection plays a fundamental role in Earth observation for analyzing temporal iterations over time. However, recent studies have largely neglected the utilization of multimodal data that presents significant practical and technical advantages
External link:
http://arxiv.org/abs/2310.09276
3D building generation with low data acquisition costs, such as single image-to-3D, becomes increasingly important. However, most of the existing single image-to-3D building creation works are restricted to those images with specific viewing angles,
External link:
http://arxiv.org/abs/2309.00158
In real-world traffic scenarios, agents such as pedestrians and car drivers often observe neighboring agents who exhibit similar behavior as examples and then mimic their actions to some extent in their own behavior. This information can serve as pri
External link:
http://arxiv.org/abs/2308.05634
Interactive image segmentation aims to segment the target from the background with the manual guidance, which takes as input multimodal data such as images, clicks, scribbles, and bounding boxes. Recently, vision transformers have achieved a great su
External link:
http://arxiv.org/abs/2307.02280
Learning similarity between scene graphs and images aims to estimate a similarity score given a scene graph and an image. There is currently no research dedicated to this task, although it is critical for scene graph generation and downstream applica
External link:
http://arxiv.org/abs/2304.00590
Author:
Liu, Mengmeng, Cheng, Hao, Chen, Lin, Broszio, Hellward, Li, Jiangtao, Zhao, Runjiang, Sester, Monika, Yang, Michael Ying
Trajectory prediction for autonomous driving must continuously reason the motion stochasticity of road agents and comply with scene constraints. Existing methods typically rely on one-stage trajectory prediction models, which condition future traject
External link:
http://arxiv.org/abs/2302.13933