Author: |
Zhu, Delong; Fang, Yuqi; Min, Zhe; Ho, Danny; Meng, Max Q.-H. |
Subject: |
|
Source: |
IEEE Transactions on Industrial Electronics; Jan2022, Vol. 69 Issue 1, p582-591, 10p |
Abstract: |
Autonomous elevator operation is considered a promising solution for mobile navigation in office buildings. As a fundamental function, elevator button recognition remains unsolved due to challenging image conditions and a severe data imbalance problem. In this article, we propose an accurate and efficient framework, named OCR-RCNN, for elevator button recognition. The framework comprises a button detector built on a region-based convolutional neural network (R-CNN) and an attention-RNN-based character recognizer. Leveraging the two components, we further propose an end-to-end architecture and a cascaded architecture to explore the most effective network design for the framework. Moreover, a perspective distortion removal algorithm is developed to enhance the inference performance of OCR-RCNN. Another key contribution of this work is that we release the first large-scale elevator panel dataset with 2005 images and 21 767 button labels. Extensive experiments are conducted on the released dataset and two other publicly available datasets. The proposed framework achieves F1 scores of 0.94, 1.00, and 1.00 in the detection task, and accuracies of 79.6%, 96.5%, and 96.4% in the character recognition task. The results demonstrate the advantages of our method, which outperforms alternative strategies and other state-of-the-art methods in the literature. The data and code are available on the project webpage https://github.com/zhudelong/ocr-rcnn-v2. [ABSTRACT FROM AUTHOR] |
Database: |
Complementary Index |
External link: |
|