Abstract: |
YOLOv2 is an object detection algorithm based on the Darknet neural network and is widely used in advanced driver assistance systems. However, YOLOv2 must be accelerated on a high-performance computing platform before it can be put into practical use. Each computing platform has its own characteristics, and developers find it hard to weigh the merits and drawbacks of each acceleration platform and pick the right one for their actual requirements. This paper analyzes the pros and cons of embedded GPUs and FPGAs for accelerating the YOLOv2 algorithm in terms of development speed, power efficiency, and computing performance. The analysis gives developers insight into choosing hardware to optimize the YOLOv2 algorithm. The experimental data show that a deeply optimized FPGA exceeds the embedded GPU in both power efficiency and speed. However, FPGA development is difficult and takes developers much more time than GPU development. Finally, we propose a balanced method that takes advantage of the GPU's development speed and the FPGA's high performance. [ABSTRACT FROM AUTHOR] |