Popis: |
Nowadays, most computer vision problems are solved using artificial intelligence. These techniques outperform traditional computer vision algorithms in most scenarios and even open up computer vision to a variety of challenging new fields. One of these fields is remote sensing. Using artificial intelligence, we can automatically extract complex metadata that aids the decision-making processes of governments, industries, and other stakeholders. Nevertheless, several key challenges remain to be solved before these algorithms can be deployed successfully:
1. One of the main challenges in this field is coping with the huge amount of data. Aerial orthomosaics are often on the order of 10⁹ pixels in size, while the objects to detect can be as small as a few hundred pixels, which quickly makes detection extremely challenging.
2. Privacy is a major social issue with remote sensing data. Aerial images often capture data across huge regions, disregarding private areas or people who might be visible in the data. One possible solution is to process the images on board the sensor devices themselves; however, running artificial neural networks on such constrained devices remains a major challenge.
3. The majority of computer vision algorithms work on traditional red-green-blue image data, whereas many remote sensing sensors offer additional types of data that give opportunities to improve our algorithms. We still need a way to use these new types of data optimally and to integrate them with the traditional image data.
During this PhD we worked on three different object detection use cases, while finding solutions for the aforementioned challenges. Firstly, we developed a pipeline to run object detection networks on remote sensing data. Our initial pipeline processed the orthomosaic with a sliding window, adding overlap between the different image patches.
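The sliding-window step above can be sketched as follows. This is a minimal illustration, not the thesis's actual pipeline; the tile size (512) and overlap (64) are hypothetical values chosen for the example.

```python
import numpy as np

def sliding_windows(height, width, tile=512, overlap=64):
    """Return (y, x) top-left corners of overlapping tiles covering an image.

    `tile` and `overlap` are illustrative values, not the thesis's settings.
    """
    stride = tile - overlap
    ys = list(range(0, max(height - tile, 0) + 1, stride))
    xs = list(range(0, max(width - tile, 0) + 1, stride))
    # Ensure the bottom/right edges are always covered by a final tile.
    if ys[-1] + tile < height:
        ys.append(height - tile)
    if xs[-1] + tile < width:
        xs.append(width - tile)
    return [(y, x) for y in ys for x in xs]

# Cut an orthomosaic into fixed-size patches for an object detector.
mosaic = np.zeros((2000, 3000, 3), dtype=np.uint8)  # placeholder image
patches = [mosaic[y:y + 512, x:x + 512]
           for y, x in sliding_windows(*mosaic.shape[:2])]
```

The overlap ensures that an object straddling a tile boundary appears whole in at least one patch; detections from neighbouring patches are then typically merged (e.g. by non-maximum suppression) when mapped back to mosaic coordinates.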
While this proved that artificial intelligence can be adapted to remote sensing use cases, we also improved the results significantly by implementing a series of scene-specific pre- and post-processing steps. Secondly, we researched the added value of sensor fusion. More specifically, we developed a technique to merge different types of data inside a neural network and applied it to object detection on red-green-blue and depth data. We tested our technique on a variety of datasets, demonstrating the benefit of fusing this data for both natural and remote sensing images. Thirdly, we implemented a series of techniques to reduce the computational complexity of our algorithms, with the goal of running them in real time on embedded devices. By combining mobile convolutions, pruning and quantisation techniques, we reduced the complexity of a neural network significantly without sacrificing accuracy. To summarize, we developed a variety of techniques that enable object detection networks to run on remote sensing data, clearly demonstrating the feasibility of this approach. We also showed that the accuracy of our models can be increased further when different types of data are available. Finally, we determined that many neural networks are oversized for their tasks, allowing their computational complexity to be reduced without sacrificing accuracy.
status: published