Description: |
In this paper, we introduce a new channel pruning method, Similarity-aware Channel Pruning, to simultaneously accelerate and compress CNNs. Most existing channel pruning methods select channels to prune according to filter saliency. However, a small filter magnitude does not necessarily imply low saliency with respect to the outputs, since the outputs are also influenced by the input values and the cumulative computation. Hence, we propose to determine which channels to prune by first comparing the similarity of a layer's output feature maps. Based on this similarity, we trace back to all the corresponding weight groups (filters, mean, variance, bias, etc.) and decide which channels can be removed. Then, we generate the new parameters of the next layer with a Weight Combination strategy so as to reconstruct the outputs with fewer channels. The Weight Combination strategy also makes it easier to recover accuracy through fine-tuning. The key idea of the proposed method is to address the redundancy of feature maps directly: we remove the redundant weight groups of the current layer that generate similar outputs and discard the related channels of the next layer that take the removed feature maps as inputs. Extensive experiments on several advanced CNN architectures verify the effectiveness of our approach.
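To make the pipeline concrete, the sketch below illustrates the two steps described above in plain NumPy: marking a channel redundant when its output feature map is highly similar to that of an already-kept channel, and folding the removed channel's contribution into the next layer's weights so the outputs can be approximately reconstructed with fewer channels. The cosine-similarity metric, the greedy threshold, and the additive merge are illustrative assumptions; the paper's actual similarity measure and Weight Combination strategy may differ.

```python
import numpy as np

def cosine_similarity_matrix(feature_maps):
    """Pairwise cosine similarity of per-channel feature maps.

    feature_maps: (C, H, W) activations of one layer, e.g. averaged
    over a small calibration set.
    """
    flat = feature_maps.reshape(feature_maps.shape[0], -1)
    unit = flat / (np.linalg.norm(flat, axis=1, keepdims=True) + 1e-12)
    return unit @ unit.T  # (C, C) similarity matrix

def find_redundant_channels(feature_maps, threshold=0.95):
    """Greedy pass: channel j is redundant if its feature map is
    sufficiently similar to that of an earlier kept channel i.
    Returns (kept_indices, merged) with merged[j] = i.
    """
    sim = cosine_similarity_matrix(feature_maps)
    kept, merged = [], {}
    for j in range(sim.shape[0]):
        match = next((i for i in kept if sim[i, j] >= threshold), None)
        if match is None:
            kept.append(j)
        else:
            merged[j] = match
    return kept, merged

def combine_next_layer_weights(next_weights, kept, merged):
    """Illustrative weight combination: since the feature map of a
    removed channel approximates that of its kept match, add the next
    layer's input-channel weights for the removed channel onto the
    matched channel, then drop the removed columns.

    next_weights: (C_out, C_in, k, k) conv kernel of the next layer.
    (Simplifying assumption: matched feature maps have comparable scale.)
    """
    new_w = next_weights.copy()
    for removed, target in merged.items():
        new_w[:, target] += new_w[:, removed]
    return new_w[:, kept]  # keep only columns for surviving channels
```

A hypothetical usage with example shapes:

```python
fmap = np.random.randn(64, 14, 14)       # one layer's output feature maps
w_next = np.random.randn(128, 64, 3, 3)  # next layer's conv kernel
kept, merged = find_redundant_channels(fmap, threshold=0.9)
w_pruned = combine_next_layer_weights(w_next, kept, merged)  # (128, len(kept), 3, 3)
```

After this step, the pruned network would typically be fine-tuned to recover accuracy, as noted above.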