Learning convolutional neural networks from few samples

Authors: Roland Schweiger, Markus Thom, Albrecht Rothermel, Raimar Wagner, Günther Palm
Year: 2013
Source: IJCNN
DOI: 10.1109/ijcnn.2013.6706969
Description: Learning Convolutional Neural Networks (CNNs) is commonly carried out by plain supervised gradient descent. With sufficient training data, this leads to very competitive results on visual recognition tasks even when starting from a random initialization. When labeled data is scarce, however, CNNs reveal their strong dependence on large training sets. Recent results have shown that a well-chosen optimization starting point can be beneficial for convergence to a minimum that generalizes well. Such starting points have mostly been found using unsupervised feature learning techniques such as sparse coding, or via transfer learning from related recognition tasks. In this work, we compare these two approaches against a simple patch-based initialization scheme and a random initialization of the weights. We show that pre-training helps to train CNNs from few samples, and that the right choice of initialization scheme can improve the network's performance by up to 41% compared to random initialization.
Database: OpenAIRE
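The abstract does not spell out the patch-based initialization scheme, but a common variant samples random patches of the filter's size from training images, normalizes them, and uses them as the initial convolution kernels. The sketch below illustrates that idea under stated assumptions; the function name, normalization choices, and grayscale input format are illustrative, not taken from the paper.

```python
import numpy as np

def patch_based_init(images, num_filters, kernel_size, rng=None):
    """Initialize conv filters from randomly sampled, normalized image patches.

    images: array of shape (N, H, W) holding grayscale training images.
    Returns an array of shape (num_filters, kernel_size, kernel_size).
    """
    rng = np.random.default_rng(rng)
    n, h, w = images.shape
    filters = np.empty((num_filters, kernel_size, kernel_size))
    for i in range(num_filters):
        # Pick a random image and a random top-left corner inside it.
        img = images[rng.integers(n)]
        y = rng.integers(h - kernel_size + 1)
        x = rng.integers(w - kernel_size + 1)
        patch = img[y:y + kernel_size, x:x + kernel_size].astype(float)
        patch -= patch.mean()              # zero-mean, as in whitened features
        norm = np.linalg.norm(patch)
        filters[i] = patch / norm if norm > 1e-8 else patch
    return filters

# Usage: eight 5x5 filters drawn from 100 synthetic 28x28 "images".
imgs = np.random.default_rng(0).random((100, 28, 28))
w = patch_based_init(imgs, num_filters=8, kernel_size=5, rng=0)
print(w.shape)  # (8, 5, 5)
```

The resulting kernels already resemble local image structure, which is the intuition behind using them as an optimization starting point instead of purely random weights.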