Learning convolutional neural networks from few samples
Authors: | Roland Schweiger, Markus Thom, Albrecht Rothermel, Raimar Wagner, Günther Palm |
Year: | 2013 |
Subject: | Training set, Artificial neural network, Computer science, Competitive learning, Deep learning, Initialization, Pattern recognition, Semi-supervised learning, Machine learning, Convolutional neural network, Unsupervised learning, Artificial intelligence, Gradient descent, Neural coding, Transfer learning, Feature learning |
Source: | IJCNN |
DOI: | 10.1109/ijcnn.2013.6706969 |
Description: | Learning Convolutional Neural Networks (CNNs) is commonly carried out by plain supervised gradient descent. With sufficient training data, this leads to very competitive results on visual recognition tasks even when starting from a random initialization. When the amount of labeled data is limited, however, CNNs reveal their strong dependence on large training sets. Recent results have shown that a well-chosen optimization starting point can be beneficial for convergence to a minimum that generalizes well. Such a starting point has mostly been found using unsupervised feature learning techniques such as sparse coding, or via transfer learning from related recognition tasks. In this work, we compare these two approaches against a simple patch-based initialization scheme and a random initialization of the weights. We show that pre-training helps to train CNNs from few samples, and that the correct choice of initialization scheme can improve the network's performance by up to 41% compared to random initialization. |
Database: | OpenAIRE |
External link: |
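The patch-based initialization the abstract compares against can be sketched roughly as follows. This is a hypothetical illustration of the general idea (drawing convolution filters from random, normalized image patches), not the authors' exact procedure; the function name `patch_init` and its parameters are assumptions for this sketch.

```python
import numpy as np

def patch_init(images, num_filters, kernel_size, rng=None):
    """Initialize convolution filters from random training-image patches.

    Hypothetical sketch: each filter is a randomly cropped patch,
    shifted to zero mean and scaled to unit norm, so the initial
    filters already resemble local image structure instead of noise.
    """
    rng = np.random.default_rng(rng)
    n, h, w = images.shape  # grayscale images: (count, height, width)
    filters = np.empty((num_filters, kernel_size, kernel_size))
    for i in range(num_filters):
        idx = rng.integers(n)                      # pick a random image
        y = rng.integers(h - kernel_size + 1)      # random top-left corner
        x = rng.integers(w - kernel_size + 1)
        patch = images[idx, y:y + kernel_size, x:x + kernel_size].astype(float)
        patch -= patch.mean()                      # zero mean
        norm = np.linalg.norm(patch)
        filters[i] = patch / norm if norm > 0 else patch  # unit norm
    return filters
```

Filters initialized this way can then be used as the starting weights of the first convolutional layer before supervised gradient descent, in place of a purely random initialization.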