Evidence integration credal classification algorithm versus missing data distributions

Authors: Zuo-wei Zhang, Ji-huan He, Zong-fa Ma, Xing-yu Zhu, Zhe Liu
Contributors: Northwestern Polytechnical University [Xi'an] (NPU), Université de Rennes 1 (UR1), Université de Rennes (UNIV-RENNES), Xi’an University of Architecture and Technology, Declarative & Reliable management of Uncertain, user-generated Interlinked Data (DRUID), GESTION DES DONNÉES ET DE LA CONNAISSANCE (IRISA-D7), Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA), Université de Rennes (UR)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Université de Bretagne Sud (UBS)-École normale supérieure - Rennes (ENS Rennes)-Institut National de Recherche en Informatique et en Automatique (Inria)-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS)-IMT Atlantique (IMT Atlantique), Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT)-Université de Rennes (UR)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT)-Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA), Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT), Université de Rennes (UR)
Language: English
Publication year: 2021
Subject:
Source: Information Sciences
Information Sciences, Elsevier, 2021, 569, pp.39-54. ⟨10.1016/j.ins.2021.04.008⟩
ISSN: 0020-0255
DOI: 10.1016/j.ins.2021.04.008
Description: In complex incomplete pattern classification, the results produced by a single classifier and used for decision-making may be unreliable and uncertain because missing data are randomly distributed. This paper proposes a new evidence integration credal classification algorithm (EICA) for multiple classifiers working on different attributes, which aims to reduce the negative impact of missing values on incomplete pattern classification by modeling them locally. In EICA, the dataset is first grouped into several subsets, and the missing values in each subset are estimated from similar subpatterns with different weights. The similarity is measured by discounting the overall similarity of subpatterns and the local similarity of attributes, fully exploiting the distribution characteristics of the attributes: the greater the variation of an attribute's distribution across classes, the greater its weight. The classification results of the edited subpatterns, each with a discounting factor obtained from an optimization function, can often provide (more or less) useful information for classifying the query pattern. These discounted pieces of evidence (classifier outputs), represented as basic belief assignments (BBAs), are therefore fused globally on the basis of evidence theory to classify the query pattern. The validity of EICA is demonstrated on various real datasets.
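The final step described above, discounting each classifier's BBA and then fusing the results, can be illustrated with a minimal sketch. It assumes classical Shafer discounting and Dempster's rule of combination; the abstract does not state which discounting or combination rule EICA actually uses. The function names, the two-class frame, the masses, and the discounting factors are all hypothetical (in the paper the factors come from an optimization function).

```python
from itertools import product


def discount(bba, alpha, frame):
    """Shafer discounting: keep a fraction alpha of each mass and
    transfer the remaining 1 - alpha to the whole frame (ignorance)."""
    out = {focal: alpha * mass for focal, mass in bba.items() if focal != frame}
    out[frame] = alpha * bba.get(frame, 0.0) + (1.0 - alpha)
    return out


def dempster(m1, m2):
    """Dempster's rule: multiply masses, assign each product to the
    intersection of the focal sets, and renormalize away the conflict."""
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + mass_a * mass_b
        else:
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    return {focal: mass / (1.0 - conflict) for focal, mass in combined.items()}


# Hypothetical two-class frame and outputs of two classifiers.
frame = frozenset({"A", "B"})
m1 = {frozenset({"A"}): 0.8, frozenset({"B"}): 0.2}
m2 = {frozenset({"A"}): 0.3, frozenset({"B"}): 0.7}

# Hypothetical discounting factors (assumed here, not taken from the paper).
d1 = discount(m1, 0.9, frame)
d2 = discount(m2, 0.5, frame)

fused = dempster(d1, d2)
for focal, mass in sorted(fused.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(focal), round(mass, 4))
```

Because the second source is discounted more heavily, its preference for class B carries less weight in the fused result, and some mass remains on the whole frame to represent residual ignorance.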
Database: OpenAIRE