A Study of Data Augmentation for Handwritten Character Recognition using Deep Learning

Authors: Hidehiro Ohki, Taihei Hayashi, Toshiya Takami, Keiji Gyohten
Publication year: 2018
Subject:
Source: ICFHR
DOI: 10.1109/icfhr-2018.2018.00102
Description: While convolutional neural networks have achieved significant results in handwriting recognition in recent years, they require large amounts of training data to perform satisfactorily. Data augmentation, which increases the number of images by applying general image processing methods, makes it possible to prepare large amounts of training images without additional labor. However, with conventional data augmentation methods it is difficult to generate character images like those written by different people, and therefore difficult to overcome the lack of training data. In this paper, we propose a method that acquires the probability distribution of features related to character structure and uses that distribution to generate character images in various handwriting styles. The proposed method learns statistical character structure models, composed of probability distributions of strokes, from character image data. By generating each stroke from its probability distribution and assembling the strokes into a character, it becomes possible to generate character images of various handwriting samples that are not influenced by the original images. In comparative experiments on handwritten character recognition with a convolutional neural network, good results were obtained when the proposed method was used together with conventional data augmentation methods.
Database: OpenAIRE
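
Illustrative sketch (not from the paper): the description above outlines learning a probability distribution per stroke and sampling new strokes to assemble a character. The abstract does not say how strokes are represented or which distributions are used, so the following is only a minimal sketch under loud assumptions: each stroke is a fixed-length list of (x, y) control points, every coordinate is modelled by an independent Gaussian, and the names `fit_stroke_model` and `sample_character` are hypothetical. Rendering the sampled strokes into an image is left out.

```python
import random
from statistics import mean, stdev

# ASSUMPTION: a stroke is a list of (x, y) control points and a character
# sample is a list of strokes. The paper's actual representation is not
# given in the abstract.

def fit_stroke_model(samples):
    """Fit an independent Gaussian (mean, std) to every coordinate of every stroke."""
    model = []
    for s in range(len(samples[0])):            # each stroke index
        stroke_model = []
        for p in range(len(samples[0][s])):     # each control point in the stroke
            xs = [sample[s][p][0] for sample in samples]
            ys = [sample[s][p][1] for sample in samples]
            stroke_model.append(((mean(xs), stdev(xs)), (mean(ys), stdev(ys))))
        model.append(stroke_model)
    return model

def sample_character(model, rng):
    """Draw each stroke from its fitted distribution and assemble a character."""
    return [
        [(rng.gauss(mx, sx), rng.gauss(my, sy)) for (mx, sx), (my, sy) in stroke]
        for stroke in model
    ]

# Toy data: three handwritten variants of a two-stroke, "T"-like character.
samples = [
    [[(0.0, 0.0), (10.0, 0.0)], [(5.0, 0.0), (5.0, 12.0)]],
    [[(1.0, 1.0), (11.0, 0.0)], [(6.0, 1.0), (5.0, 13.0)]],
    [[(0.0, 1.0), (9.0, 1.0)], [(5.0, 0.0), (6.0, 11.0)]],
]

model = fit_stroke_model(samples)
rng = random.Random(0)  # fixed seed so the sketch is reproducible
new_chars = [sample_character(model, rng) for _ in range(5)]
```

Because each generated character is drawn from the fitted distributions rather than transformed from one source image, it is not tied to any single original sample, which is the property the description emphasizes.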