Bias in Deep Neural Networks in Land Use Characterization for International Development
Author: Diego Kiedanski, Alan Descoins, Do-Hyung Kim, Braulio Ríos, Christopher Fabian, Guzmán López, Iyke Maduako, Shilpa Arora, Naroa Zurutuza
Year of publication: 2021
Subject: Computer science; remote sensing; trustworthy AI; land use; land cover; machine learning; artificial neural networks; artificial intelligence; covariates; data collection; learned representations; General Earth and Planetary Sciences
Source: Remote Sensing, Vol 13, Iss 15, p 2908 (2021)
ISSN: 2072-4292
DOI: 10.3390/rs13152908
Description: Understanding the biases in algorithms based on Deep Neural Networks (DNNs) is gaining paramount importance due to their increasing application to many real-world problems. The known tendency of DNNs to penalize underrepresented populations could undermine the efficacy of development projects that depend on data produced with DNN-based models. Despite this, bias in DNNs for Land Use and Land Cover Classification (LULCC) has not been the subject of many studies. In this study, we explore ways to quantify bias in DNNs for land use, using the identification of school buildings in Colombia from satellite imagery as an example. We implement a DNN-based model by fine-tuning an existing pre-trained model for school building identification. The model achieved an overall accuracy of 84%. We then used socioeconomic covariates to analyze possible biases in the learned representation. The retrained deep neural network was used to extract visual features (embeddings) from satellite image tiles. The embeddings were clustered into four subtypes of schools, and the accuracy of the neural network model was assessed for each cluster. The distributions of various socioeconomic covariates across the clusters were analyzed to identify links between model accuracy and those covariates. Our results indicate that model accuracy is lowest (57%) where the landscape is predominantly characterized by poverty and remoteness, confirming our original assumption about the heterogeneous performance of Artificial Intelligence (AI) algorithms and their biases. Based on our findings, we identify possible sources of bias and present suggestions on how to prepare a balanced training dataset that would result in less biased AI algorithms. The framework used in our study to better understand biases in DNN models will be useful when Machine Learning (ML) techniques are adopted in lieu of ground-based data collection for international development programs. Because such programs aim to address social inequality, ML techniques are only applicable when they are transparent and accountable.
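The per-cluster bias analysis described above (cluster the DNN's image embeddings into subtypes, then measure the classifier's accuracy within each cluster) can be sketched as follows. This is a minimal illustration, not the authors' code: `kmeans` and `per_cluster_accuracy` are hypothetical helper names, and a small NumPy k-means stands in for whatever clustering method the paper actually used.

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Minimal k-means: assign each row of X to one of k clusters."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Distance from every point to every center, then nearest-center labels.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def per_cluster_accuracy(embeddings, y_true, y_pred, k=4):
    """Cluster embeddings into k subtypes and report model accuracy per cluster.

    Uneven accuracies across clusters flag subpopulations (e.g. remote,
    poorer areas) where the model underperforms.
    """
    clusters = kmeans(embeddings, k)
    return {int(c): float((y_pred[clusters == c] == y_true[clusters == c]).mean())
            for c in np.unique(clusters)}
```

Comparing the resulting per-cluster accuracies against socioeconomic covariates aggregated over the same clusters is what links the 57% worst-cluster accuracy to poverty and remoteness in the study.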
Database: OpenAIRE
External link: