Convergence analysis of the batch gradient-based neuro-fuzzy learning algorithm with smoothing L1/2 regularization for the first-order Takagi–Sugeno system
Authors: Yan Liu, Dakun Yang
Year of publication: 2017
Subjects: Mathematical optimization; Computational complexity theory; Neuro-fuzzy; Logic; First order; Regularization (mathematics); Takagi–Sugeno; Artificial Intelligence; Gradient-based algorithm; Regression problems; Algorithm; Smoothing; Mathematics; 02 engineering and technology; 0209 industrial biotechnology; 020901 industrial engineering & automation; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing
Source: Fuzzy Sets and Systems 319:28–49
ISSN: 0165-0114
DOI: 10.1016/j.fss.2016.07.003
Description: It has been proven that Takagi–Sugeno systems are universal approximators, and they are widely applied to classification and regression problems. The main challenges for these models are convergence analysis and computational complexity, which stem from the large number of connections and the need to prune unnecessary parameters. The neuro-fuzzy learning algorithm therefore involves two tasks: generating a comparably sparse network and training its parameters. Regularization methods have attracted increasing attention for network pruning; in particular, the Lq (0 < q < 1) regularizer can obtain sparser solutions than L1 regularization. The L1/2 regularizer has an especially strong sparsity-inducing capacity and is representative of the Lq (0 < q < 1) family. However, the nonsmoothness of the L1/2 regularizer may lead to oscillations in the learning process. In this study, we propose a gradient-based neuro-fuzzy learning algorithm with smoothing L1/2 regularization for the first-order Takagi–Sugeno fuzzy inference system. The proposed approach has three advantages: (i) it improves on the original L1/2 regularizer by eliminating the oscillation of the gradient of the cost function during training; (ii) it prunes inactive connections more effectively, removing more redundant connections than the original L1/2 regularizer while performing structure and parameter learning simultaneously; and (iii) it admits a theoretical convergence analysis, which is the explicit focus of this work. We also provide a series of simulations demonstrating that smoothing L1/2 regularization often yields more compressive representations than the standard L1/2 regularization. (An illustrative sketch of the smoothing idea follows this record.)
Database: OpenAIRE
External link:
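
The abstract attributes the oscillation problem to the nonsmoothness of the raw L1/2 penalty, whose gradient is unbounded near zero weights. The sketch below illustrates one common smoothing construction from the smoothing-L1/2 literature: |w| is replaced near the origin by a piecewise polynomial h(w) that matches |w| in value and slope at |w| = a, so that h(w)^(1/2) is smooth with a bounded gradient. This is an illustration under that assumption, not the paper's exact smoothing function, and the names (smoothed_abs, the threshold a) are hypothetical.

```python
import numpy as np

def l_half_penalty(w):
    """Raw L1/2 penalty: sum_i |w_i|^(1/2).

    Its gradient ~ 0.5*|w|^(-1/2) blows up as w -> 0, which is the
    source of the oscillation the abstract describes."""
    return np.sum(np.abs(w) ** 0.5)

def smoothed_abs(w, a=0.1):
    """Smooth surrogate for |w| (illustrative choice).

    For |w| >= a it equals |w|; for |w| < a it is the quartic
    -w^4/(8a^3) + 3w^2/(4a) + 3a/8, which matches |w| in value and
    first derivative at |w| = a and is bounded away from 0 at w = 0."""
    w = np.asarray(w, dtype=float)
    inner = -w**4 / (8 * a**3) + 3 * w**2 / (4 * a) + 3 * a / 8
    return np.where(np.abs(w) >= a, np.abs(w), inner)

def smoothed_l_half_penalty(w, a=0.1):
    """Smoothing L1/2 penalty: sum_i h(w_i)^(1/2) with h = smoothed_abs."""
    return np.sum(smoothed_abs(w, a) ** 0.5)

def smoothed_l_half_grad(w, a=0.1):
    """Gradient of the smoothed penalty.

    Since h(0) = 3a/8 > 0 and h'(0) = 0, the gradient stays bounded
    (and is 0 at w = 0), so batch gradient descent on
    error + lambda * penalty no longer oscillates near zero weights."""
    w = np.asarray(w, dtype=float)
    h = smoothed_abs(w, a)
    dh = np.where(np.abs(w) >= a, np.sign(w),
                  -w**3 / (2 * a**3) + 3 * w / (2 * a))
    return 0.5 * h ** (-0.5) * dh
```

In a batch gradient scheme of the kind the abstract describes, the penalty gradient would simply be added to the error gradient in each update, e.g. w <- w - eta * (grad_error(w) + lam * smoothed_l_half_grad(w)); weights driven into the smooth region around zero can then be pruned, which is how the regularizer supports simultaneous structure and parameter learning.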