Showing 1 - 1 of 1 for search: '"Hou, Chenshu"'
3D visual grounding aims to identify objects in 3D point cloud scenes that match specific natural language descriptions. This requires the model to not only focus on the target object itself but also to consider the surrounding environment to determi
External link:
http://arxiv.org/abs/2407.14491