A Graph-Based Encoding for Evolutionary Convolutional Neural Network Architecture Design
Author: | Mengjie Zhang, Yanan Sun, Bing Xue, William Irwin-Harris |
---|---|
Year of publication: | 2019 |
Subject: |
Training set; Contextual image classification; business.industry; Computer science; Deep learning; 02 engineering and technology; 010501 environmental sciences; Directed acyclic graph; 01 natural sciences; Convolutional neural network; Evolutionary computation; Random search; Encoding (memory); 0202 electrical engineering, electronic engineering, information engineering; Domain knowledge; 020201 artificial intelligence & image processing; Artificial intelligence; Representation (mathematics); business; 0105 earth and related environmental sciences |
Source: | CEC |
DOI: | 10.1109/cec.2019.8790093 |
Description: | Convolutional neural networks (CNNs) have demonstrated highly effective performance in image classification across a range of data sets. The best performance can only be obtained with CNNs when the appropriate architecture is chosen, which depends on both the volume and nature of the training data available. Many of the state-of-the-art architectures in the literature have been hand-crafted by human researchers, but this requires expertise in CNNs, domain knowledge, or trial-and-error experimentation, often using expensive resources. Recent work based on evolutionary deep learning has offered an alternative, in which evolutionary computation (EC) is applied to automatic architecture search. A key component in evolutionary deep learning is the chosen encoding strategy; however, previous approaches to CNN encoding in EC typically have restrictions in the architectures that can be represented. Here, we propose an encoding strategy based on a directed acyclic graph representation, and introduce an algorithm for random generation of CNN architectures using this encoding. In contrast to previous work, our proposed encoding method is more general, enabling representation of CNNs of arbitrary connectional structure and unbounded depth. We demonstrate its effectiveness using a random search, in which 200 randomly generated CNN architectures are evaluated. To improve the computational efficiency, the 200 CNNs are trained using only 10% of the CIFAR-10 training data; the three best-performing CNNs are then re-trained on the full training set. The results show that the proposed representation and initialisation method can achieve promising accuracy compared to manually designed architectures, despite the simplicity of the random search approach and the reduced data set. We intend that future work can improve on these results by applying evolutionary search using this encoding. |
Database: | OpenAIRE |
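
The description above outlines a directed-acyclic-graph encoding, random generation of architectures, and a random search that keeps the three best of 200 candidates. The following is only a minimal illustrative sketch of that general idea, not the authors' algorithm: the layer vocabulary `LAYER_TYPES`, the node cap `max_nodes`, the edge probability `extra_edge_prob`, and `placeholder_fitness` are all hypothetical stand-ins that do not appear in the record. In particular, the node cap is a simplification (the paper's encoding is described as allowing unbounded depth), and the fitness function here does no training at all.

```python
import random
from dataclasses import dataclass

# Hypothetical layer vocabulary; the record does not list the node types used.
LAYER_TYPES = ("conv3x3", "conv5x5", "max_pool", "avg_pool")


@dataclass
class Node:
    op: str        # layer type applied at this node
    inputs: list   # indices of earlier nodes; -1 denotes the network input


def random_dag(rng, max_nodes=8, extra_edge_prob=0.3):
    """Sample a random architecture as a list of nodes in topological order.

    Every edge runs from a lower index to a higher one, so the graph is
    acyclic by construction. max_nodes is an assumed cap for this sketch.
    """
    nodes = []
    for i in range(rng.randint(1, max_nodes)):
        candidates = list(range(-1, i))      # network input plus all earlier nodes
        inputs = {rng.choice(candidates)}    # guarantee at least one incoming edge
        for c in candidates:                 # optionally add skip connections
            if rng.random() < extra_edge_prob:
                inputs.add(c)
        nodes.append(Node(op=rng.choice(LAYER_TYPES), inputs=sorted(inputs)))
    return nodes


def is_acyclic(nodes):
    """Check that every edge points from an earlier node (or the input) forward."""
    return all(src < i for i, node in enumerate(nodes) for src in node.inputs)


def placeholder_fitness(nodes):
    """Stand-in score (edge count). A real run would train the decoded CNN
    on a data subset and return its validation accuracy."""
    return sum(len(node.inputs) for node in nodes)


def random_search(evaluate, rng, population=200, top_k=3):
    """Generate `population` random architectures and return the `top_k` best."""
    candidates = [random_dag(rng) for _ in range(population)]
    return sorted(candidates, key=evaluate, reverse=True)[:top_k]


if __name__ == "__main__":
    best = random_search(placeholder_fitness, random.Random(0))
    for dag in best:
        print(placeholder_fitness(dag), [(n.op, n.inputs) for n in dag])
```

In an actual experiment along the lines described, `evaluate` would be replaced by a function that builds and trains each network on the reduced training subset, and the returned top three would then be re-trained on the full training set.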