Showing 1 - 10 of 21 results for search: '"Gwak, JunYoung"'
Recent research in multi-task learning reveals the benefit of solving related problems in a single neural network. 3D object detection and multi-object tracking (MOT) are two heavily intertwined problems predicting and associating an object instance…
External link:
http://arxiv.org/abs/2208.10056
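The "associating" step mentioned in this abstract is, in its simplest generic form, a bipartite matching between existing tracks and new detections. The Python sketch below illustrates that building block with an IoU cost matrix and the Hungarian algorithm from SciPy; the function names and threshold are illustrative assumptions, not the method of the paper linked above.

import numpy as np
from scipy.optimize import linear_sum_assignment

def iou_2d(a, b):
    # Axis-aligned IoU of boxes given as (x1, y1, x2, y2).
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def associate(tracks, detections, iou_threshold=0.3):
    # Hungarian matching on a (num_tracks x num_detections) cost matrix of 1 - IoU;
    # matched pairs below the IoU threshold are discarded.
    cost = np.array([[1.0 - iou_2d(t, d) for d in detections] for t in tracks])
    rows, cols = linear_sum_assignment(cost)
    return [(int(r), int(c)) for r, c in zip(rows, cols) if 1.0 - cost[r, c] >= iou_threshold]

tracks = [(0, 0, 10, 10), (20, 20, 30, 30)]
detections = [(21, 19, 31, 29), (1, 1, 11, 11)]
print(associate(tracks, detections))  # [(0, 1), (1, 0)]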
3D object detection has been widely studied due to its potential applicability to many promising areas such as robotics and augmented reality. Yet, the sparse nature of the 3D data poses unique challenges to this task. Most notably, the observable surface…
External link:
http://arxiv.org/abs/2006.12356
Authors:
Shenoi, Abhijeet; Patel, Mihir; Gwak, JunYoung; Goebel, Patrick; Sadeghian, Amir; Rezatofighi, Hamid; Martín-Martín, Roberto; Savarese, Silvio
Robots navigating autonomously need to perceive and track the motion of objects and other agents in their surroundings. This information enables planning and executing robust and safe trajectories. To facilitate these processes, the motion should be perceived…
External link:
http://arxiv.org/abs/2002.08397
Authors:
Martín-Martín, Roberto; Patel, Mihir; Rezatofighi, Hamid; Shenoi, Abhijeet; Gwak, JunYoung; Frankel, Eric; Sadeghian, Amir; Savarese, Silvio
We present JRDB, a novel egocentric dataset collected from our social mobile manipulator JackRabbot. The dataset includes 64 minutes of annotated multimodal sensor data including stereo cylindrical 360° RGB video at 15 fps, 3D point clouds from…
External link:
http://arxiv.org/abs/1910.11792
Authors:
Armeni, Iro; He, Zhi-Yang; Gwak, JunYoung; Zamir, Amir R.; Fischer, Martin; Malik, Jitendra; Savarese, Silvio
A comprehensive semantic understanding of a scene is important for many applications - but in what space should diverse semantic information (e.g., objects, scene categories, material types, texture, etc.) be grounded and what should be its structure…
External link:
http://arxiv.org/abs/1910.02527
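The paper's answer to the question posed above is to ground semantics in a graph over 3D space. The minimal Python sketch below assumes a simplified node/edge structure (layer names, free-form attributes, relation triples) purely for illustration; it is not the paper's exact schema.

from dataclasses import dataclass, field

@dataclass
class Node:
    node_id: str
    layer: str                                      # e.g. "building", "room", "object", "camera"
    attributes: dict = field(default_factory=dict)  # e.g. class, material, 3D location

@dataclass
class SceneGraph:
    nodes: dict = field(default_factory=dict)       # node_id -> Node
    edges: list = field(default_factory=list)       # (source_id, relation, target_id)

    def add(self, node):
        self.nodes[node.node_id] = node

    def relate(self, source_id, relation, target_id):
        self.edges.append((source_id, relation, target_id))

graph = SceneGraph()
graph.add(Node("room_3", "room", {"scene_category": "kitchen"}))
graph.add(Node("chair_7", "object", {"class": "chair", "location": (1.2, 0.0, 3.4)}))
graph.relate("chair_7", "part_of", "room_3")
print(len(graph.nodes), len(graph.edges))  # 2 1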
In many robotics and VR/AR applications, 3D-videos are readily-available sources of input (a continuous sequence of depth images, or LIDAR scans). However, those 3D-videos are processed frame-by-frame either through 2D convnets or 3D perception algorithms…
External link:
http://arxiv.org/abs/1904.08755
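The contrast drawn in this abstract is between per-frame processing and treating the whole 3D-video as one spatio-temporal input. The sketch below illustrates only the data-representation half of that idea, under assumptions made for this example (a fixed voxel size, a plain NumPy array instead of the paper's sparse-tensor machinery): quantize each frame's points and stack them into a single sparse set of (x, y, z, t) coordinates that a 4D convolutional network could consume.

import numpy as np

def to_sparse_4d(frames, voxel_size=0.05):
    # frames: list of (N_i, 3) float arrays, one point cloud per time step.
    # Returns an (M, 4) integer array of unique (x, y, z, t) voxel coordinates.
    coords = []
    for t, points in enumerate(frames):
        quantized = np.floor(points / voxel_size).astype(np.int32)
        time_column = np.full((quantized.shape[0], 1), t, dtype=np.int32)
        coords.append(np.hstack([quantized, time_column]))
    return np.unique(np.vstack(coords), axis=0)  # deduplicate points falling in the same voxel

frames = [np.random.rand(1000, 3) for _ in range(8)]  # toy 3D-video: 8 frames of points
sparse_coords = to_sparse_4d(frames)
print(sparse_coords.shape)  # (M, 4): sparse spatio-temporal occupancy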
Authors:
Rezatofighi, Hamid; Tsoi, Nathan; Gwak, JunYoung; Sadeghian, Amir; Reid, Ian; Savarese, Silvio
Intersection over Union (IoU) is the most popular evaluation metric used in the object detection benchmarks. However, there is a gap between optimizing the commonly used distance losses for regressing the parameters of a bounding box and maximizing this metric value…
External link:
http://arxiv.org/abs/1902.09630
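The metric this paper proposes to close that gap is the Generalized IoU, GIoU = IoU - |C \ (A ∪ B)| / |C|, where C is the smallest box enclosing both A and B, with 1 - GIoU used as the regression loss. A small Python sketch for axis-aligned 2D boxes given as (x1, y1, x2, y2):

def giou(a, b):
    # Intersection and union of the two boxes.
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    iou = inter / union
    # Smallest enclosing box C; GIoU subtracts the fraction of C not covered by the union.
    c_area = (max(a[2], b[2]) - min(a[0], b[0])) * (max(a[3], b[3]) - min(a[1], b[1]))
    return iou - (c_area - union) / c_area

a, b = (0, 0, 2, 2), (3, 3, 5, 5)
print(giou(a, b))  # -0.68: the boxes are disjoint, so IoU is 0, yet GIoU still
                   # provides a signal that depends on how far apart they are.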
3D semantic scene labeling is fundamental to agents operating in the real world. In particular, labeling raw 3D point sets from sensors provides fine-grained semantics. Recent works leverage the capabilities of Neural Networks (NNs), but are limited…
External link:
http://arxiv.org/abs/1710.07563
Authors:
Kurenkov, Andrey; Ji, Jingwei; Garg, Animesh; Mehta, Viraj; Gwak, JunYoung; Choy, Christopher; Savarese, Silvio
3D reconstruction from a single image is a key problem in multiple applications ranging from robotic manipulation to augmented reality. Prior methods have tackled this problem through generative models which predict 3D reconstructions as voxels or point clouds…
External link:
http://arxiv.org/abs/1708.04672
Supervised 3D reconstruction has witnessed significant progress through the use of deep neural networks. However, this increase in performance requires large-scale annotations of 2D/3D data. In this paper, we explore inexpensive 2D supervision as a…
External link:
http://arxiv.org/abs/1705.10904
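One inexpensive form of 2D supervision for a voxel-based reconstruction network is a differentiable silhouette projection loss: max-project the predicted occupancies along the viewing axis and compare the result with a 2D foreground mask. The sketch below is only an illustration of that general idea under an orthographic-projection assumption; the paper's actual formulation (projection geometry, additional constraints) is more involved.

import torch
import torch.nn.functional as F

def silhouette_loss(voxels, mask):
    # voxels: (B, D, H, W) occupancy probabilities in [0, 1] predicted by a network.
    # mask:   (B, H, W) binary ground-truth silhouette seen along the depth axis.
    silhouette = voxels.max(dim=1).values  # orthographic max-projection over depth
    return F.binary_cross_entropy(silhouette, mask)

voxels = torch.rand(2, 32, 64, 64, requires_grad=True)  # stand-in for a network output
mask = (torch.rand(2, 64, 64) > 0.5).float()
loss = silhouette_loss(voxels, mask)
loss.backward()  # gradients from the 2D mask flow back into the 3D volume
print(loss.item())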