ReLTanh: An activation function with vanishing gradient resistance for SAE-based DNNs and its application to rotating machinery fault diagnosis

Authors: Yi Qin, Xin Wang, Sheng Xiang, Yi Wang, Haizhou Chen
Year of publication: 2019
Subject:
Source: Neurocomputing. 363:88-98
ISSN: 0925-2312
DOI: 10.1016/j.neucom.2019.07.017
Description: Tanh is a sigmoidal activation function that suffers from the vanishing gradient problem, so researchers have proposed alternative functions, including the rectified linear unit (ReLU). However, those vanishing-resistant functions bring other problems, such as bias shift and sensitivity to noise. To overcome the vanishing gradient problem without introducing other problems, we propose a new activation function named Rectified Linear Tanh (ReLTanh), which improves on traditional Tanh. ReLTanh is constructed by replacing Tanh’s saturated waveforms in the positive and negative inactive regions with two straight lines, whose slopes are given by Tanh’s derivatives at two learnable thresholds. The middle Tanh waveform gives ReLTanh its nonlinear fitting ability, and the linear parts help relieve the vanishing gradient problem. In addition, the thresholds of ReLTanh that determine the slopes of the linear parts are learnable, so the function can tolerate variation in the inputs and help minimize the cost function and maximize data-fitting performance. Mathematical derivations demonstrate that ReLTanh can diminish the vanishing gradient problem and that its thresholds are trainable. To verify the practical feasibility and effectiveness of ReLTanh, fault diagnosis experiments for planetary gearboxes and rolling bearings are conducted with stacked autoencoder-based deep neural networks (SAE-based DNNs). ReLTanh successfully alleviates the vanishing gradient problem, and it learns faster, more steadily, and more precisely than Tanh, which is consistent with the theoretical analysis. Additionally, ReLTanh surpasses other popular activation functions such as the ReLU family, Hexpo, and Swish, which shows that ReLTanh has application potential and research value.
Database: OpenAIRE
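
The construction described in the abstract (Tanh in the middle, straight lines outside two thresholds, slopes equal to Tanh's derivative at those thresholds) can be sketched roughly as follows. This is a minimal scalar illustration, not the authors' implementation: the names `reltanh`, `reltanh_grad`, `thr_neg`, and `thr_pos` are hypothetical, the threshold values used in the demo are arbitrary, and the sketch assumes each line is anchored at its threshold so that the function stays continuous. In the paper the thresholds are learnable; here they are plain arguments.

```python
import math


def reltanh(x, thr_neg, thr_pos):
    """Piecewise activation: Tanh between the thresholds, linear outside.

    Assumption: each line passes through (thr, tanh(thr)) with slope
    tanh'(thr) = 1 - tanh(thr)**2, which keeps the function continuous.
    """
    if thr_neg >= thr_pos:
        raise ValueError("thr_neg must be smaller than thr_pos")
    if x >= thr_pos:
        slope = 1.0 - math.tanh(thr_pos) ** 2
        return math.tanh(thr_pos) + slope * (x - thr_pos)
    if x <= thr_neg:
        slope = 1.0 - math.tanh(thr_neg) ** 2
        return math.tanh(thr_neg) + slope * (x - thr_neg)
    return math.tanh(x)


def reltanh_grad(x, thr_neg, thr_pos):
    """Derivative with respect to x: constant in the linear regions."""
    if x >= thr_pos:
        return 1.0 - math.tanh(thr_pos) ** 2
    if x <= thr_neg:
        return 1.0 - math.tanh(thr_neg) ** 2
    return 1.0 - math.tanh(x) ** 2


if __name__ == "__main__":
    # Arbitrary illustrative thresholds, not values from the paper.
    thr_neg, thr_pos = -1.5, 1.0
    for x in (-10.0, -1.0, 0.5, 10.0):
        tanh_grad = 1.0 - math.tanh(x) ** 2
        print(x, reltanh(x, thr_neg, thr_pos),
              reltanh_grad(x, thr_neg, thr_pos), tanh_grad)
```

For inputs far from zero, the derivative of plain Tanh shrinks toward zero, while the sketch above keeps the constant slope fixed by the nearest threshold, which is the mechanism the abstract credits for relieving vanishing gradients.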