Generation method of synthetic training data for mobile OCR system

Authors: Alexander Sheshkus, Alexander V. Gayer, Yulia S. Chernyshova
Year of publication: 2018
Subject:
Source: ICMV
DOI: 10.1117/12.2310119
Description: This paper addresses one of the fundamental problems of machine learning: acquiring training data. Obtaining enough natural training data is difficult and expensive. In recent years, the use of synthetic images has become increasingly beneficial, as it saves human time and provides a large number of images that would otherwise be difficult to obtain. However, to learn successfully from an artificial dataset, one should try to reduce the gap between the natural and synthetic data distributions. In this paper we describe an algorithm for creating artificial training datasets for OCR systems, using the Russian passport as a case study.
Database: OpenAIRE