Showing 1 - 10 of 31,473 for search: '"Scene understanding"'
Author:
Wang, Juan (wang.juan.t4@dc.tohoku.ac.jp), Wang, Zhijie (zhijie@vision.is.tohoku.ac.jp), Miyazaki, Tomo (tomo@tohoku.ac.jp), Fan, Yaohou (fan.yaohou.t4@dc.tohoku.ac.jp), Omachi, Shinichiro (shinichiro.omachi.b5@tohoku.ac.jp)
Published in:
Sensors (14248220). Oct2024, Vol. 24 Issue 19, p6166. 13p.
Scene graphs have proven to be highly effective for various scene understanding tasks due to their compact and explicit representation of relational information. However, current methods often overlook the critical importance of preserving symmetry w
External link:
http://arxiv.org/abs/2411.10509
Author:
Aryan, FNU, Stepputtis, Simon, Bhagat, Sarthak, Campbell, Joseph, Lee, Kwonjoon, Mahjoub, Hossein Nourkhiz, Sycara, Katia
Scene understanding is a fundamental capability needed in many domains, ranging from question-answering to robotics. Unlike recent end-to-end approaches that must explicitly learn varying compositions of the same scene, our method reasons over their
External link:
http://arxiv.org/abs/2410.22626
Multi-modal fusion has played a vital role in multi-modal scene understanding. Most existing methods focus on cross-modal fusion involving two modalities, often overlooking more complex multi-modal fusion, which is essential for real-world applicatio
External link:
http://arxiv.org/abs/2410.14944
Author:
Terenzi, Lorenzo, Nubert, Julian, Eyschen, Pol, Roth, Pascal, Fei, Simin, Jelavic, Edo, Hutter, Marco
Construction sites are challenging environments for autonomous systems due to their unstructured nature and the presence of dynamic actors, such as workers and machinery. This work presents a comprehensive panoptic scene understanding solution design
External link:
http://arxiv.org/abs/2410.04250
Monocular geometric scene understanding combines panoptic segmentation and self-supervised depth estimation, focusing on real-time application in autonomous vehicles. We introduce MGNiceNet, a unified approach that uses a linked kernel formulation fo
External link:
http://arxiv.org/abs/2411.11466
With the recent rise of Large Language Models (LLMs), Vision-Language Models (VLMs), and other general foundation models, there is growing potential for multimodal, multi-task embodied agents that can operate in diverse environments given only natura
External link:
http://arxiv.org/abs/2411.03540
Author:
Li, Li
3D LiDAR point cloud data is crucial for scene perception in computer vision, robotics, and autonomous driving. Geometric and semantic scene understanding, involving 3D point clouds, is essential for advancing autonomous driving technologies. However
External link:
http://arxiv.org/abs/2411.00600