Description: |
The presence of robotic products in our lives is steadily increasing, with the first commercial home and industrial robots becoming available on the market. While existing systems mostly rely on robust spatial navigation to fulfil their roles, one can easily imagine more useful and more challenging tasks to automate. In order to perform these tasks autonomously, robots need to be able to understand the geometry and the structure of their surroundings. While this can be achieved using a range of different sensors, monocular cameras are the most cost-effective and ubiquitous, and the most interesting from a research perspective.

The most popular technique for recovering 3D geometric information from monocular imagery is Visual Simultaneous Localisation and Mapping (SLAM). While traditional landmark-based SLAM systems have demonstrated high levels of accuracy and robustness in trajectory estimation, the maps they produce do not provide enough information about the environment to enable interactive robotics. State-of-the-art dense SLAM methods can produce rich scene reconstructions, but they are fragile and suffer from a number of limitations. A particular problem is the lack of an efficient representation for the observed scenes: current methods use over-parametrised, geometry-focused representations of the environment, which fail to exploit the inherent semantic structure of man-made environments that human brains discover and exploit so easily.

This thesis explores recent advances in the field of deep learning in order to address these issues in dense SLAM. New representations based on neural networks are proposed to enable more efficient and robust reasoning about the observed scenes. We also demonstrate novel SLAM systems that make use of these representations, pointing towards the next generation of SLAM research.