A perception pipeline exploiting trademark databases for service robots

Author: Joshua Song
Year of publication: 2020
Subject:
DOI: 10.14264/uql.2020.171
Description: Service robots are a potentially useful aid for elderly people or those with impairments. An example of a task a service robot might be required to complete is object fetching, i.e., retrieving an object and bringing it to the person. However, the ability to recognize the many household objects a robot may encounter remains an open problem, and this problem is the focus of this thesis. A MOVO robot, a mobile manipulator equipped with two 6-DOF (degrees of freedom) arms and a Kinect sensor, was used for experiments. The first part of this thesis involved developing a perception pipeline for MOVO that processes raw sensor data into a format usable by the motion planner. The performance of several different object recognition algorithms was compared in a cup detection task. A CNN (Convolutional Neural Network) outperformed the other methods, but it was noted that it requires a significant number of training images.

Manually collecting information on all the objects a robot may encounter in a household is tedious and time-consuming. Therefore, the second part of this thesis examined the use of large-scale data from existing trademark databases. These databases contain logo images and descriptions of the goods and services each logo was registered under; for example, Pepsi is registered under soft drinks. To generate training data from the database images, RDSL (Randomization-based Data Synthesizer) was developed based on ideas from domain randomization. RDSL uses 3D rendering software to automatically generate synthetic data from the databases' logo images. A CNN logo detector trained on RDSL synthetic data outperformed previous logo detectors trained on synthetic data. The use of this logo detector was also demonstrated in a practical implementation of object fetching by MOVO. Tests on the robot indicated promising results, despite no manually labelled real-world photos being used for training.
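The domain-randomization idea behind RDSL can be illustrated with a minimal sketch: composite a logo image onto a canvas with randomized background and placement, and record the bounding box as the training label. This is a hypothetical 2D stand-in for the thesis's 3D-rendering pipeline; the function name `synthesize_sample` and the grid-of-pixels image representation are illustrative assumptions, not the actual RDSL implementation.

```python
import random

def synthesize_sample(logo, canvas_size=64, rng=random):
    """Hypothetical domain-randomization sketch (not the real RDSL).

    Pastes `logo` (a 2D list of pixel intensities) onto a canvas with a
    random background and random position, returning (image, bbox) where
    bbox = (x, y, w, h) serves as the detector's training label.
    """
    h, w = len(logo), len(logo[0])
    # Randomized background intensity stands in for the varied scenes,
    # textures, and lighting that domain randomization uses so a detector
    # trained on synthetic data transfers to real photos.
    bg = rng.randint(0, 255)
    image = [[bg] * canvas_size for _ in range(canvas_size)]
    # Random placement forces the detector to learn position invariance.
    x = rng.randint(0, canvas_size - w)
    y = rng.randint(0, canvas_size - h)
    for r in range(h):
        for c in range(w):
            image[y + r][x + c] = logo[r][c]
    return image, (x, y, w, h)

# Usage: a 4x4 constant-intensity patch stands in for a trademark logo.
logo = [[200] * 4 for _ in range(4)]
image, bbox = synthesize_sample(logo)
```

In the actual system, each sample would additionally randomize scale, rotation, and 3D surface geometry via the rendering software, and thousands of such labelled samples would be generated per logo to train the CNN detector.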
Database: OpenAIRE