Semantics of Voids within Data: Ignorance-Aware Machine Learning

Authors:	Vagan Terziyan, Anton Nikulin
Language:	English
Year of publication:	2021
Subject:	data semantics; data mining; classification; ignorance; data voids; prototype selection; Geography (General); G1-922
Source:	ISPRS International Journal of Geo-Information, Vol 10, Iss 4, p 246 (2021)
Document type:	article
ISSN:	2220-9964
DOI:	10.3390/ijgi10040246
Description:	Operating with ignorance is an important concern of geographical information science when the objective is to discover knowledge from imperfect spatial data. Data mining (driven by knowledge discovery tools) processes available (observed, known, and understood) samples of data in order to build a model (e.g., a classifier) that can handle data samples that are not yet observed, known, or understood. These tools traditionally take semantically labeled samples of the available data (known facts) as input for learning. We challenge the indispensability of this approach and suggest considering things the other way around. What if the task were instead to build a model based on the semantics of our ignorance, i.e., by processing the shape of “voids” within the available data space? Can we improve traditional classification by also modeling the ignorance? In this paper, we provide algorithms for the discovery and visualization of ignorance zones in two-dimensional data spaces and design two ignorance-aware smart prototype selection techniques (incremental and adversarial) to improve the performance of nearest neighbor classifiers. We present experiments with artificial and real datasets to test the usefulness of ignorance semantics discovery.
Database:	Directory of Open Access Journals
External link:	https://doaj.org/article/9ee1afcc556e48b5ae7a2b16ba106233