Group Pruning Using a Bounded-$$\ell_p$$ Norm for Group Gating and Regularization
Author: Volker Fischer, Tim Genewein, Dan Zhang, Thomas Brox, Chaithanya Kumar Mummadi
Year of publication: 2019
Subject: Discrete mathematics, Artificial neural network, Computer science, Gating, Regularization (mathematics), Bounded function, Deep neural networks, Artificial intelligence & image processing, Lp space
Source: GCPR, Lecture Notes in Computer Science (ISBN: 9783030336752)
DOI: 10.1007/978-3-030-33676-9_10
Description: Deep neural networks achieve state-of-the-art results on several tasks, but at the cost of increasing complexity. It has been shown that neural networks can be pruned during training by imposing sparsity-inducing regularizers. In this paper, we investigate two techniques for group-wise pruning during training that improve network efficiency. We propose a gating factor after every convolutional layer to induce channel-level sparsity, encouraging insignificant channels to become exactly zero. Further, we introduce and analyse a bounded variant of the \(\ell_1\) regularizer, which interpolates between the \(\ell_1\) and \(\ell_0\) norms to retain the performance of the network at higher pruning rates. To underline the effectiveness of the proposed methods, we show that the number of parameters of ResNet-164, DenseNet-40, and MobileNetV2 can be reduced by \(30\%\), \(69\%\), and \(75\%\) respectively on CIFAR100 without a significant drop in accuracy. We achieve state-of-the-art pruning results for ResNet-50 on ImageNet with higher accuracy. Furthermore, we show that the lightweight MobileNetV2 can be compressed further on ImageNet without a significant drop in performance.
Database: OpenAIRE
External link:
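Below is a minimal sketch of the two ingredients named in the description above, assuming a PyTorch implementation: a learnable per-channel gate inserted after a convolution, and a saturating sparsity penalty on the gates that behaves like \(\ell_1\) near zero and flattens out like \(\ell_0\) for larger values. The gate parameterization, the penalty's functional form \(|g|/(|g|+\tau)\), and the scale hyperparameter `tau` are illustrative assumptions, not necessarily the paper's exact formulation.

```python
import torch
import torch.nn as nn

class ChannelGate(nn.Module):
    """Learnable per-channel gate placed after a convolutional layer.

    Channels whose gate is driven to exactly zero can be pruned away,
    together with the corresponding filters of the preceding conv.
    (Illustrative sketch; the paper's parameterization may differ.)
    """
    def __init__(self, num_channels: int):
        super().__init__()
        self.gate = nn.Parameter(torch.ones(num_channels))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x has shape (batch, channels, height, width)
        return x * self.gate.view(1, -1, 1, 1)

def bounded_sparsity_penalty(gates: torch.Tensor, tau: float = 0.1) -> torch.Tensor:
    """Saturating penalty: roughly |g|/tau near zero (l1-like), approaching
    1 for |g| >> tau (l0-like). Both the functional form and the default
    value of `tau` are assumptions made for this sketch.
    """
    g = gates.abs()
    return (g / (g + tau)).sum()

# Usage: add the penalty to the task loss and train as usual.
layer = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), ChannelGate(16))
x = torch.randn(8, 3, 32, 32)
out = layer(x)
reg = 1e-4 * bounded_sparsity_penalty(layer[1].gate)  # weight is illustrative
loss = out.pow(2).mean() + reg  # dummy task loss, stands in for cross-entropy
loss.backward()
```

In this sketch, gates that reach exactly zero during training mark channels that can be removed at export time, shrinking the network without retraining from scratch.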