Outlier elimination using granular box regression
Author: | Suhaimi Ibrahim, M. Reza Mashinchi, Ali Selamat, Hamido Fujita |
---|---|
Year of publication: | 2016 |
Subject: |
Linear model; Volume (computing); 020206 networking & telecommunications; Regression analysis; 02 engineering and technology; computer.software_genre; Regression; Data set; ComputingMethodologies_PATTERNRECOGNITION; Outlier elimination; Hardware and Architecture; Signal Processing; Linear regression; Outlier; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Data mining; computer; Software; Information Systems; Mathematics |
Source: | Information Fusion. 27:161-169 |
ISSN: | 1566-2535 |
Description: | We employ granular box regression to eliminate outliers. We propose penalty schemes, on instances or on boxes, to configure granular boxes. We investigate the performance in terms of regression analysis and box configuration. It offers better linear models for data sets with high and low rates of outliers. The penalty scheme on instances improves 72% of regression and 99% of box configuration. A regression method should fit a curve to a data set irrespective of outliers. This paper modifies granular box regression approaches to deal with data sets that contain outliers. Each approach follows a three-stage procedure: granular box configuration, outlier elimination, and linear regression analysis. The first stage investigates two objective functions, each applying a different penalty scheme, on boxes or on instances. The second stage investigates two methods of outlier elimination, after which the third stage performs the linear regression. The performance of the proposed granular box regressions is investigated in terms of volume of boxes, insensitivity of boxes to outliers, elapsed time for box configuration, and regression error. The proposed approach offers a better linear model, with smaller error, on the given data sets, which contain a variety of outlier rates. The investigation shows the superiority of applying the penalty scheme on instances. |
Database: | OpenAIRE |
External link: |
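
The three-stage procedure named in the description (box configuration, outlier elimination, linear regression) can be illustrated with a rough Python sketch. This is not the paper's method: the record does not give the objective functions or penalty schemes, so the sketch substitutes equal-width boxes along x with quantile-based bounds on y. The function name `fit_with_box_elimination` and the parameters `n_boxes` and `q` are hypothetical, chosen only for illustration.

```python
import numpy as np

def fit_with_box_elimination(x, y, n_boxes=4, q=0.1):
    """Three-stage sketch: box configuration, outlier elimination, linear fit.

    Assumption: boxes are equal-width x intervals bounded in y by quantiles.
    The paper's penalty-based box configuration is NOT reproduced here.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Stage 1 (assumed stand-in): assign every instance to one of n_boxes
    # equal-width intervals along x.
    edges = np.linspace(x.min(), x.max(), n_boxes + 1)
    box_idx = np.clip(np.digitize(x, edges[1:-1]), 0, n_boxes - 1)

    # Stage 2 (assumed stand-in): keep only instances whose y value lies
    # inside the [q, 1 - q] quantile range of their own box.
    keep = np.zeros(x.shape, dtype=bool)
    for b in range(n_boxes):
        in_box = box_idx == b
        if not in_box.any():
            continue
        lo, hi = np.quantile(y[in_box], [q, 1.0 - q])
        keep |= in_box & (y >= lo) & (y <= hi)

    # Stage 3: ordinary least-squares line through the retained instances.
    slope, intercept = np.polyfit(x[keep], y[keep], deg=1)
    return slope, intercept, keep
```

Calling `fit_with_box_elimination(x, y)` returns the fitted slope and intercept together with a boolean mask of the retained instances. Quantile trimming also discards some clean instances near each box boundary, a cost that a penalty-based configuration such as the one described above would presumably handle differently.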