Investigation of the influence of outliers on text documents probabilistic classifier quality
Author: | Andrey I. Kapitanov, Elena L. Fedotova, Vladimir M. Troyanovskiy, Valentin V. Slyusar, Ilona I. Kapitanova |
---|---|
Publication year: | 2017 |
Subject: | Probabilistic classification; Computer science; Pattern recognition; Bayes classifier; Quadratic classifier; Machine learning; Naive Bayes classifier; Statistical classification; ComputingMethodologies_PATTERNRECOGNITION; Margin classifier; Artificial intelligence; F1 score; Classifier (UML) |
Source: | 2017 IEEE Conference of Russian Young Researchers in Electrical and Electronic Engineering (EIConRus). |
Description: | In this paper we investigate how outliers in the training set influence the quality of a probabilistic classifier. Using the naive Bayes classifier as an example, we show how its quality characteristics depend on the percentage of outliers. This dependence is assessed with three basic metrics of classifier quality: precision, recall, and F1 score. Finally, we propose a method for reducing the influence of outliers on classifier quality by approximating a piecewise linear function and then applying gradient methods. |
Database: | OpenAIRE |
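
The record does not include the authors' code or data, so the following is only a minimal sketch of the kind of experiment the description outlines. It assumes that an "outlier" can be modeled as a training document with a wrong label, which the record itself does not state. The function names (`train_nb`, `predict_nb`, `inject_outliers`, `precision_recall_f1`) and the toy documents are hypothetical, and the proposed piecewise linear approximation with gradient methods is not reproduced here.

```python
import math
import random
from collections import Counter, defaultdict


def train_nb(docs, labels):
    """Fit a multinomial naive Bayes model (counts only; smoothing is applied at prediction)."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict_nb(model, text):
    """Return the class with the highest log-posterior, using Laplace (add-one) smoothing."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token in vocab:  # tokens never seen in training are ignored
                score += math.log(
                    (word_counts[label][token] + 1) / (total_words + len(vocab))
                )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def inject_outliers(labels, fraction, seed=0):
    """Assumption: simulate outliers by giving a fraction of documents a wrong label."""
    rng = random.Random(seed)
    classes = sorted(set(labels))
    noisy = list(labels)
    k = round(fraction * len(labels))
    for i in rng.sample(range(len(labels)), k):
        noisy[i] = rng.choice([c for c in classes if c != labels[i]])
    return noisy


def precision_recall_f1(y_true, y_pred, positive):
    """Compute precision, recall, and F1 for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy corpus, invented for illustration only.
train_docs = [
    "cheap pills buy now",
    "buy cheap watches",
    "meeting agenda attached",
    "project meeting tomorrow",
]
train_labels = ["spam", "spam", "ham", "ham"]
```

To trace a dependence on the outlier percentage, one could call `inject_outliers` with increasing `fraction` values, retrain with `train_nb` on the noisy labels, and record `precision_recall_f1` on a clean held-out set for each setting.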