Efficient attribute reduction from the viewpoint of discernibility

Authors:	Afeng Yang, Mi He, Min Lu, Jun Zhang, Shuhua Teng, Yongjian Nian
Publication year:	2016
Subject:	Information Systems and Management; business.industry; Heuristic; computer.software_genre; Machine learning; Computer Science Applications; Theoretical Computer Science; Set (abstract data type); Reduction (complexity); Artificial Intelligence; Control and Systems Engineering; Pattern recognition (psychology); Preprocessor; Attribute domain; Data mining; Rough set; Artificial intelligence; business; Decision table; computer; Software; Mathematics
Source:	Information Sciences. 326:297-314
ISSN:	0020-0255
Description:	Attribute reduction is an important preprocessing step in pattern recognition, machine learning and data mining. As an effective method for attribute reduction, rough set theory offers a useful and formal methodology. It retains the discernibility power of the original datasets; thus, attribute reduction has been extensively studied in rough set theory. However, the inefficiency of the existing attribute reduction algorithms limits the application of rough sets. In this paper, we first analyse the limitations of existing attribute reduction algorithms. Then, a novel measure of attribute quality, called the relative discernibility degree, is proposed based on discernibility. Theoretical analysis shows that this measure can identify relative dispensable attributes and that it remains unchanged when the relative dispensable attributes and redundant objects are removed during attribute selection. This property can be used to reduce the search space and accelerate the heuristic process of attribute reduction. Consequently, a new attribute reduction algorithm is proposed from the viewpoint of discernibility. Furthermore, the relationships among the reduction definitions of the algebra view, information view and discernibility view are derived. Some non-equivalent relationships among these views of rough set theory in inconsistent decision tables are discovered. A set of numerical experiments was conducted on UCI datasets. Experimental results show that the proposed algorithm is effective and efficient and is applicable to large-scale datasets.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::884f27314a496fec9b526bb17db831a5 https://doi.org/10.1016/j.ins.2015.07.052
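
Illustrative note: the abstract describes heuristic attribute reduction driven by discernibility, that is, by how many pairs of objects with different decisions an attribute can tell apart. The Python sketch below shows only that generic idea as a greedy cover over discernible pairs. It is not the relative discernibility degree or the algorithm proposed in the paper, whose definitions are not given in this record. The function names, the toy decision table and the tie-breaking rule are assumptions made for illustration, and the greedy result may contain attributes that a true reduct would drop.

```python
from itertools import combinations


def discernible_pairs(rows, decisions, attr):
    """Return the object pairs (i, j) that have different decisions
    and different values on attribute `attr`."""
    return {
        (i, j)
        for i, j in combinations(range(len(rows)), 2)
        if decisions[i] != decisions[j] and rows[i][attr] != rows[j][attr]
    }


def greedy_reduct(rows, decisions):
    """Greedily pick attributes until every pair that the full attribute
    set can discern is discerned by the chosen subset.

    Generic heuristic for illustration only (hypothetical helper, not the
    paper's method). Assumes `rows` is non-empty.
    """
    n_attrs = len(rows[0])
    per_attr = {a: discernible_pairs(rows, decisions, a) for a in range(n_attrs)}
    # Pairs that at least one attribute can discern; these must be covered.
    remaining = set().union(*per_attr.values())
    chosen = []
    while remaining:
        # Pick the unused attribute covering the most uncovered pairs;
        # ties go to the lowest attribute index (an arbitrary assumption).
        best = max(
            (a for a in per_attr if a not in chosen),
            key=lambda a: (len(per_attr[a] & remaining), -a),
        )
        chosen.append(best)
        remaining -= per_attr[best]
    return sorted(chosen)


# Toy decision table: three condition attributes, one decision per object.
rows = [
    (0, 0, 1),
    (0, 1, 1),
    (1, 0, 0),
    (1, 1, 0),
]
decisions = [0, 1, 0, 1]

print(greedy_reduct(rows, decisions))  # prints [1]
```

In this toy table the decision equals the second attribute, so that single attribute discerns every pair of objects with different decisions and the sketch returns it alone.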