A principled approach to network-based classification and data representation

Authors:	Ian H. Jarman, Héctor Ruiz, Terence A. Etchells, José D. Martín, Paulo J. G. Lisboa
Year of publication:	2013
Subject:	Cognitive Neuroscience; Fisher kernel; Pattern recognition; Probability density function; Conditional probability distribution; External Data Representation; Computer Science Applications; Weighting; Euclidean distance; Data point; Artificial Intelligence; Data mining; Fisher information; Mathematics
Source:	Neurocomputing. 112:79-91
ISSN:	0925-2312
DOI:	10.1016/j.neucom.2012.12.050
Description:	Measures of similarity are fundamental in pattern recognition and data mining. Typically the Euclidean metric is used in this context, weighting all variables equally and therefore assuming equal relevance, which is very rare in real applications. In contrast, given an estimate of a conditional density function, the Fisher information calculated in primary data space implicitly measures the relevance of variables in a principled way by reference to auxiliary data such as class labels. This paper proposes a framework that uses a distance metric based on Fisher information to construct similarity networks that achieve a more informative and principled representation of data. The framework enables efficient retrieval of reference cases from which a weighted nearest neighbour classifier closely approximates the original density function. Importantly, the identification of nearby data points permits the retrieval of further information with potential relevance to the assessment of a new case. The practical application of the method is illustrated for six benchmark datasets.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::2b784ffcbb9e52e67940d3c6a96b0c06 https://doi.org/10.1016/j.neucom.2012.12.050
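
The abstract's central idea, a distance derived from the Fisher information of a conditional density estimate, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a hypothetical softmax-linear estimate p(c|x) = softmax(Wx + b), takes the Fisher information matrix in data space as F(x) = sum_c p(c|x) g_c g_c^T with g_c the gradient of log p(c|x) with respect to x, and approximates the distance between two points by summing local lengths along the straight line joining them (a simplification, not a true geodesic). All names (`fisher_matrix`, `fisher_distance`, `W`, `b`, `n_segments`) are illustrative assumptions.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D array of logits."""
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()

def fisher_matrix(x, W, b):
    """Fisher information of p(c|x) with respect to x, assuming a
    softmax-linear model p(c|x) = softmax(W @ x + b)."""
    p = softmax(W @ x + b)          # class posteriors, shape (C,)
    G = W - p @ W                   # row c is grad_x log p(c|x), shape (C, D)
    return (G * p[:, None]).T @ G   # sum_c p_c * g_c g_c^T, shape (D, D)

def fisher_distance(x, y, W, b, n_segments=20):
    """Approximate Fisher distance by summing local lengths
    sqrt(dx^T F dx) along the straight line from x to y."""
    dx = (y - x) / n_segments
    total = 0.0
    for k in range(n_segments):
        t = (k + 0.5) / n_segments              # midpoint of segment k
        F = fisher_matrix(x + t * (y - x), W, b)
        total += np.sqrt(max(dx @ F @ dx, 0.0))
    return total

# Hypothetical two-class model in which only the first variable
# influences the class posterior.
W = np.array([[2.0, 0.0], [-2.0, 0.0]])
b = np.zeros(2)
origin = np.zeros(2)
d_relevant = fisher_distance(origin, np.array([1.0, 0.0]), W, b)
d_irrelevant = fisher_distance(origin, np.array([0.0, 1.0]), W, b)
```

In this toy setting the two displacements have the same Euclidean length, yet the step along the second variable has zero Fisher distance because that variable does not change the class posterior. This is the sense in which the metric weights variables by their relevance to the auxiliary labels.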