Learning convolutional neural networks from few samples

Authors: Roland Schweiger, Markus Thom, Albrecht Rothermel, Raimar Wagner, Günther Palm
Year: 2013
Source: IJCNN
DOI: 10.1109/ijcnn.2013.6706969
Description: Learning Convolutional Neural Networks (CNNs) is commonly carried out by plain supervised gradient descent. With sufficient training data, this leads to very competitive results on visual recognition tasks even when starting from a random initialization. When labeled data is scarce, however, CNNs reveal their strong dependence on large training sets. Recent results have shown that a well-chosen optimization starting point can be beneficial for convergence to a minimum that generalizes well. Such starting points have mostly been found using unsupervised feature learning techniques such as sparse coding, or via transfer learning from related recognition tasks. In this work, we compare these two approaches against a simple patch-based initialization scheme and a random initialization of the weights. We show that pre-training helps to train CNNs from few samples, and that the right choice of initialization scheme can improve the network's performance by up to 41% compared to random initialization.
Database: OpenAIRE
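The abstract does not spell out the patch-based initialization scheme, but a common variant samples random patches of the filter's size from training images, normalizes them, and uses them as the initial convolution kernels. The sketch below illustrates that idea under stated assumptions; the function name, normalization choices, and grayscale input format are illustrative, not taken from the paper.

```python
import numpy as np

def patch_based_init(images, num_filters, kernel_size, rng=None):
    """Initialize conv filters from randomly sampled, normalized image patches.

    images: array of shape (N, H, W) holding grayscale training images.
    Returns an array of shape (num_filters, kernel_size, kernel_size).
    """
    rng = np.random.default_rng(rng)
    n, h, w = images.shape
    filters = np.empty((num_filters, kernel_size, kernel_size))
    for i in range(num_filters):
        # Pick a random image and a random top-left corner inside it.
        img = images[rng.integers(n)]
        y = rng.integers(h - kernel_size + 1)
        x = rng.integers(w - kernel_size + 1)
        patch = img[y:y + kernel_size, x:x + kernel_size].astype(float)
        patch -= patch.mean()              # zero-mean, as in whitened features
        norm = np.linalg.norm(patch)
        filters[i] = patch / norm if norm > 1e-8 else patch
    return filters

# Usage: eight 5x5 filters drawn from 100 synthetic 28x28 "images".
imgs = np.random.default_rng(0).random((100, 28, 28))
w = patch_based_init(imgs, num_filters=8, kernel_size=5, rng=0)
print(w.shape)  # (8, 5, 5)
```

The resulting kernels already resemble local image structure, which is the intuition behind using them as an optimization starting point instead of purely random weights.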