Convergence analysis of the batch gradient-based neuro-fuzzy learning algorithm with smoothing L1/2 regularization for the first-order Takagi–Sugeno system

Authors: Yan Liu, Dakun Yang
Publication year: 2017
Source: Fuzzy Sets and Systems. 319:28-49
ISSN: 0165-0114
DOI: 10.1016/j.fss.2016.07.003
Description: It has been proven that Takagi–Sugeno systems are universal approximators, and they are widely applied to classification and regression problems. The main challenges of these models are convergence analysis and computational complexity, which stem from the large number of connections and the need to prune unnecessary parameters. The neuro-fuzzy learning algorithm involves two tasks: generating comparably sparse networks and training the parameters. Regularization methods have attracted increasing attention for network pruning, particularly the Lq (0 < q < 1) regularizer, which followed L1 regularization and can obtain better solutions to sparsity problems. The L1/2 regularizer has a specific sparsity capacity and is representative of Lq (0 < q < 1) regularizations. However, the nonsmoothness of the L1/2 regularizer may lead to oscillations in the learning process. In this study, we propose a gradient-based neuro-fuzzy learning algorithm with smoothing L1/2 regularization for the first-order Takagi–Sugeno fuzzy inference system. The proposed approach has three advantages: (i) it improves on the original L1/2 regularizer by eliminating the oscillation of the gradient of the cost function during training; (ii) it prunes inactive connections more effectively, removing more redundant connections than the original L1/2 regularizer while performing structure and parameter learning simultaneously; and (iii) it admits a theoretical convergence analysis, on which we focus explicitly. We also provide a series of simulations to demonstrate that smoothing L1/2 regularization can often obtain more compressive representations than the current L1/2 regularization.
Database: OpenAIRE
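The idea of smoothing the L1/2 regularizer can be sketched in a few lines. The abstract does not give the paper's exact smoothing function, Takagi–Sugeno system, or hyperparameters; the sketch below is an illustrative assumption that uses a common piecewise-polynomial approximation of |w| near the origin (so the square-root penalty and its gradient stay finite at w = 0) and applies it to a toy least-squares problem with batch gradient descent, not to the paper's fuzzy inference system.

```python
import numpy as np

def smooth_abs(w, a=0.1):
    """Smooth approximation of |w|: equals |w| for |w| >= a and a
    polynomial with matching value and slope inside (-a, a).
    Its minimum value is 3a/8 > 0. (Assumed form, not from the paper.)"""
    poly = -w**4 / (8 * a**3) + 3 * w**2 / (4 * a) + 3 * a / 8
    return np.where(np.abs(w) < a, poly, np.abs(w))

def smooth_abs_grad(w, a=0.1):
    """Derivative of smooth_abs."""
    poly_grad = -w**3 / (2 * a**3) + 3 * w / (2 * a)
    return np.where(np.abs(w) < a, poly_grad, np.sign(w))

def penalty_grad(w, lam=0.01, a=0.1):
    """Gradient of the smoothing L1/2 term lam * sum f(w)^(1/2).
    Since f(w) >= 3a/8 > 0, there is no division by zero and the
    gradient does not oscillate at w = 0, unlike the raw L1/2 term."""
    return lam * 0.5 * smooth_abs(w, a) ** (-0.5) * smooth_abs_grad(w, a)

# Batch gradient descent on a toy sparse regression problem.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
true_w = np.array([2.0, 0.0, -1.5, 0.0, 0.0])  # sparse target weights
y = X @ true_w
w = rng.normal(size=5) * 0.1
lr = 0.05
for _ in range(500):
    grad = X.T @ (X @ w - y) / len(y) + penalty_grad(w)
    w -= lr * grad
```

The piecewise polynomial is chosen so that value and first derivative match |w| at the junction |w| = a, which is what removes the gradient oscillation the abstract attributes to the nonsmooth L1/2 term; the penalty still drives the redundant weights toward zero, mimicking the pruning effect described above.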