Rethinking the activation function in lightweight network.

Author: Yang, Lu, Song, Qing, Fan, Zimeng, Liu, Chun, Hu, Mengjie
Source: Multimedia Tools & Applications; Jan2023, Vol. 82 Issue 1, p1355-1371, 17p
Abstract: The activation function plays an important role in neural networks. Applying activation functions appropriately can improve accuracy and speed up convergence. In this paper, we study the information loss caused by activation functions in lightweight networks and discuss how activation functions with negative values can solve this issue. We propose a method that minimizes changes to the existing network: we only need to replace ReLU with Swish at the appropriate positions in a lightweight network. We call this method enriching activation. Enriching activation is achieved by utilizing activation functions with negative values at the positions where ReLU causes information loss. We also propose a novel activation function called (H)-SwishX for enriching activation, which adds a learnable maximal value to (H)-Swish. (H)-SwishX learns a significant maximal value in each layer of the network to reduce the accuracy drop during lightweight network quantization. We verify this enriching activation scheme on popular lightweight networks. Compared to the existing activation schemes adopted by these lightweight networks, we demonstrate performance improvements on the CIFAR-10 and ImageNet datasets. We further demonstrate that enriching activation transfers well, and we measure its performance on MSCOCO object detection. [ABSTRACT FROM AUTHOR]
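The abstract does not give the exact formulation of (H)-SwishX. The following is a minimal PyTorch sketch assuming that H-SwishX computes H-Swish (x · ReLU6(x + 3) / 6, as in MobileNetV3) and then caps the output at a learnable per-layer maximal value initialized at 6.0; the class name HSwishX, the parameter max_val, and this parameterization are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class HSwishX(nn.Module):
    """Sketch of an H-Swish variant with a learnable maximal value."""

    def __init__(self, init_max: float = 6.0):
        super().__init__()
        # One learnable scalar cap per layer instance (assumption;
        # the paper may parameterize the maximum differently).
        self.max_val = nn.Parameter(torch.tensor(init_max))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # H-Swish: x * ReLU6(x + 3) / 6. Unlike ReLU, it takes
        # negative values for small negative inputs, which is the
        # property the paper exploits to reduce information loss.
        h_swish = x * torch.clamp(x + 3.0, min=0.0, max=6.0) / 6.0
        # Cap the output at the learned maximum so activations stay
        # in a bounded range, which eases quantization (assumed
        # behavior based on the abstract's description).
        return torch.minimum(h_swish, self.max_val)

# Usage: swap ReLU for HSwishX at a chosen position in a block.
block = nn.Sequential(nn.Conv2d(16, 16, 3, padding=1), HSwishX())
```

Keeping the cap as a single scalar per layer keeps the parameter overhead negligible for lightweight networks, and the bounded output range mirrors why ReLU6 is preferred for quantized deployment.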
Database: Complementary Index