Ego-Motion Estimation Using Recurrent Convolutional Neural Networks through Optical Flow Learning
Author: | Hongjian Wei, Xing Hu, Yingping Huang, Baigan Zhao |
---|---|
Year of publication: | 2021 |
Subject: | 0209 industrial biotechnology; Computer Networks and Communications; Computer science; Optical flow; lcsh:TK7800-8360; 02 engineering and technology; Convolutional neural network; optical flow subspace; 020901 industrial engineering & automation; Visual odometry; Motion estimation; 0202 electrical engineering, electronic engineering, information engineering; Feature (machine learning); Electrical and Electronic Engineering; business.industry; Deep learning; lcsh:Electronics; 020206 networking & telecommunications; Pattern recognition; Recurrent neural network; Hardware and Architecture; Control and Systems Engineering; Signal Processing; Robot; Artificial intelligence; business; Subspace topology |
Source: | Electronics, Vol. 10, Issue 3, Article 222 (2021) |
ISSN: | 2079-9292 |
DOI: | 10.3390/electronics10030222 |
Description: | Visual odometry (VO) is the incremental estimation of an agent's motion state (e.g., that of a vehicle or robot) from image information, and is a key component of modern localization and navigation systems. Addressing the monocular VO problem, this paper presents a novel end-to-end network for estimating camera ego-motion. The network learns the latent subspace of optical flow (OF) and models sequential dynamics, so that the motion estimate is constrained by the relations between sequential images. We compute the OF field of consecutive images and extract the latent OF representation in a self-encoding manner. A recurrent neural network then examines the changes in OF, i.e., performs sequential learning. The extracted sequential OF subspace is used to regress the 6-dimensional pose vector. We derive three models with different network structures and training schemes: LS-CNN-VO, LS-AE-VO, and LS-RCNN-VO. In particular, we train the encoder separately in an unsupervised manner. This avoids non-convergence when training the whole network and yields a more general and effective feature representation. Extensive experiments on the KITTI and Malaga datasets demonstrate that LS-RCNN-VO outperforms existing learning-based VO approaches. |
Database: | OpenAIRE |
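
The record describes VO as incremental estimation and the network output as a 6-dimensional pose vector. As a rough illustration of what "incremental" means here, the sketch below chains per-frame relative poses into a global trajectory. It is not taken from the paper: the pose layout `(tx, ty, tz, roll, pitch, yaw)`, the Z-Y-X Euler convention, and the function names `euler_to_matrix` and `accumulate` are all assumptions made for this example.

```python
import math


def euler_to_matrix(roll, pitch, yaw):
    """Rotation matrix R = Rz(yaw) * Ry(pitch) * Rx(roll) (assumed convention)."""
    cr, sr = math.cos(roll), math.sin(roll)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
        [-sp, cp * sr, cp * cr],
    ]


def matmul3(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def matvec3(a, v):
    """Product of a 3x3 matrix and a 3-vector."""
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]


def accumulate(relative_poses):
    """Chain relative poses (tx, ty, tz, roll, pitch, yaw) into global positions.

    Each relative pose is assumed to be expressed in the previous camera
    frame, so the global pose is updated as T_global = T_global * T_relative.
    """
    rotation = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    position = [0.0, 0.0, 0.0]
    trajectory = [tuple(position)]
    for tx, ty, tz, roll, pitch, yaw in relative_poses:
        # Express the frame-local translation in the global frame, then move.
        step = matvec3(rotation, [tx, ty, tz])
        position = [p + s for p, s in zip(position, step)]
        # Update the global orientation with the relative rotation.
        rotation = matmul3(rotation, euler_to_matrix(roll, pitch, yaw))
        trajectory.append(tuple(position))
    return trajectory


# Hypothetical input: move one unit forward, then turn 90 degrees, four times.
square = accumulate([(1.0, 0.0, 0.0, 0.0, 0.0, math.pi / 2)] * 4)
```

With this hypothetical input the trajectory traces a unit square and ends back at the origin, up to floating-point error. Because every step builds on the previous estimate, any per-frame error carries forward into all later poses, which is why the accuracy of each regressed pose matters.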