Estimation of misclassification probability for a distance-based classifier in high-dimensional data
Author: | Hiroki Watanabe, Yuki Yamada, Takashi Seo, Masashi Hyodo |
---|---|
Language: | English |
Year of publication: | 2019 |
Subject: |
Clustering high-dimensional data; 02 engineering and technology; 01 natural sciences; 010104 statistics & probability; Consistent estimator; 0202 electrical engineering, electronic engineering, information engineering; 0101 mathematics; asymptotic approximations; Mathematics; 62E20; Algebra and Number Theory; business.industry; Estimator; 020206 networking & telecommunications; Pattern recognition; Linear discriminant analysis; expected probability of misclassification; linear discriminant function; Euclidean distance; Sample size determination; Geometry and Topology; Artificial intelligence; 62H12; business; Classifier (UML); Analysis; 62H30; Distance based |
Source: | Hiroshima Math. J. 49, no. 2 (2019), 175-193 |
Description: | We estimate the misclassification probability of a Euclidean distance-based classifier for high-dimensional data. We discuss two types of estimator: a newly proposed plug-in estimator based on the normal approximation of the misclassification probability, and an estimator based on the well-known leave-one-out cross-validation method. Both estimators are consistent when the dimension exceeds the total sample size, and the underlying distribution need not be multivariate normal. We also numerically evaluate the mean squared errors (MSEs) of these estimators in finite-sample, high-dimensional scenarios. In simulations, the newly proposed plug-in estimator gives smaller MSEs than the estimator based on leave-one-out cross-validation. |
Database: | OpenAIRE |
External link: |
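
As an illustration of the leave-one-out cross-validation estimator mentioned in the description, the following is a minimal Python sketch, not the authors' implementation. It assumes the simplest form of a Euclidean distance-based rule (assign an observation to the class whose sample mean is nearer), without any bias-correction term the paper may use. All function names (`sq_dist`, `mean_vector`, `classify`, `loo_error_rates`, `simulate`) and the simulation settings are hypothetical choices made for this example.

```python
import random


def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def mean_vector(rows):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def classify(x, mean1, mean2):
    """Return 1 if x is at least as close to mean1 as to mean2, else 2."""
    return 1 if sq_dist(x, mean1) <= sq_dist(x, mean2) else 2


def loo_error_rates(sample1, sample2):
    """Leave-one-out estimates of the two misclassification rates.

    Each observation is held out in turn, the mean of its own class is
    recomputed without it, and the held-out point is classified.
    Each sample must contain at least two observations.
    """
    mean1, mean2 = mean_vector(sample1), mean_vector(sample2)

    wrong1 = 0
    for i, x in enumerate(sample1):
        rest = sample1[:i] + sample1[i + 1:]
        if classify(x, mean_vector(rest), mean2) != 1:
            wrong1 += 1

    wrong2 = 0
    for i, x in enumerate(sample2):
        rest = sample2[:i] + sample2[i + 1:]
        if classify(x, mean1, mean_vector(rest)) != 2:
            wrong2 += 1

    return wrong1 / len(sample1), wrong2 / len(sample2)


def simulate(n, p, shift, rng):
    """Draw n observations of dimension p with i.i.d. N(shift, 1) entries
    (an assumed toy model, used only to exercise the estimator)."""
    return [[rng.gauss(shift, 1.0) for _ in range(p)] for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    p = 200  # dimension larger than the total sample size (10 + 10)
    s1 = simulate(10, p, 0.0, rng)
    s2 = simulate(10, p, 0.5, rng)
    print(loo_error_rates(s1, s2))
```

The sketch covers only the cross-validation side. The plug-in estimator described in the abstract relies on a normal approximation whose exact form is not given in this record, so it is not reproduced here.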