Statistical Study of Machine Learning Algorithms Using Parametric and Non-Parametric Tests
Authors: | Gitanjali R. Shinde, Vijay M. Khadse, Parikshit N. Mahalle |
---|---|
Year of publication: | 2020 |
Subject: | Computer science; Data needs; Nonparametric statistics; Parametric statistics; Normality; Multiple comparisons problem; Machine learning; Artificial intelligence; Internet of Things; Health care; Software; Networking & telecommunications; Engineering and technology; Electrical engineering, electronic engineering, information engineering; Artificial intelligence & image processing |
Source: | International Journal of Ambient Computing and Intelligence. 11:80-105 |
ISSN: | 1941-6245 1941-6237 |
DOI: | 10.4018/ijaci.2020070105 |
Description: | The emerging area of the Internet of Things (IoT) generates a large amount of data from applications such as health care and smart cities. This data needs to be analyzed in order to derive useful inferences, and machine learning (ML) plays a significant role in that analysis. Selecting the optimal algorithm from the available set of algorithms/classifiers to obtain the best results is difficult, because the performance of algorithms differs when they are applied to datasets from different application domains. It is also difficult to tell whether a difference in performance is real or due to random variation in the test data, the training data, or the internal randomness of the learning algorithms. This study takes these issues into account when comparing ML algorithms for binary and multivariate classification, and it provides guidelines for the statistical validation of results. The results show that the accuracy of one algorithm differs from that of the others by the critical difference (CD) over binary and multivariate datasets obtained from different application domains. |
Database: | OpenAIRE |
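
The description refers to algorithms whose accuracy differs by a critical difference (CD) but does not say how the CD is computed. The sketch below assumes the common Nemenyi post-hoc form, CD = q_alpha * sqrt(k(k+1) / (6N)), applied to average ranks of k algorithms over N datasets; this choice is an assumption, not something the record states. The accuracy table, the algorithm names, and the helper functions are hypothetical and serve only to illustrate the arithmetic.

```python
import math

# Nemenyi critical values q_alpha at alpha = 0.05, indexed by the number of
# algorithms k (standard published table; only a few entries are listed here).
Q_ALPHA_05 = {2: 1.960, 3: 2.343, 4: 2.569, 5: 2.728}


def rank_row(scores):
    """Rank the scores of one dataset: rank 1 is the highest accuracy,
    and tied scores share the mean of the ranks they occupy."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for i in order[pos:end + 1]:
            ranks[i] = mean_rank
        pos = end + 1
    return ranks


def average_ranks(table):
    """Average each algorithm's rank over all datasets (rows of the table)."""
    n, k = len(table), len(table[0])
    sums = [0.0] * k
    for row in table:
        for j, r in enumerate(rank_row(row)):
            sums[j] += r
    return [s / n for s in sums]


def critical_difference(k, n, q_alpha=None):
    """Nemenyi critical difference for k algorithms compared over n datasets."""
    q = Q_ALPHA_05[k] if q_alpha is None else q_alpha
    return q * math.sqrt(k * (k + 1) / (6.0 * n))


# Hypothetical accuracies (made up for illustration, NOT results from the paper).
# Rows are datasets; columns are three placeholder algorithms.
names = ["alg_a", "alg_b", "alg_c"]
accuracies = [
    [0.90, 0.85, 0.80],
    [0.88, 0.84, 0.79],
    [0.93, 0.91, 0.86],
    [0.76, 0.74, 0.70],
    [0.82, 0.80, 0.75],
    [0.95, 0.92, 0.90],
]

avg = average_ranks(accuracies)
cd = critical_difference(k=len(names), n=len(accuracies))

# Two algorithms are considered significantly different when their average
# ranks differ by more than the critical difference.
significant = [
    (names[i], names[j])
    for i in range(len(names))
    for j in range(i + 1, len(names))
    if abs(avg[i] - avg[j]) > cd
]
print(avg, round(cd, 3), significant)
```

In practice a Friedman test over the same rank table would normally be run first, and the post-hoc comparison applied only if that test rejects the hypothesis that all algorithms perform equally.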