OutRank: ranking outliers in high dimensional data
Author: | U. Steinhausen, Emmanuel Müller, Thomas Seidl, Ira Assent |
---|---|
Year of publication: | 2008 |
Subject: |
Clustering high-dimensional data
business.industry; Computer science; Pattern recognition; computer.software_genre; Ranking (information retrieval); ComputingMethodologies_PATTERNRECOGNITION; Ranking; Ranking SVM; Principal component analysis; Outlier; Decision boundary; Anomaly detection; Learning to rank; Data mining; Artificial intelligence; Cluster analysis; business; computer; Categorical variable |
Source: | ICDE Workshops; Müller, E, Assent, I, Steinhausen, U & Seidl, T 2008, OutRank: ranking outliers in high dimensional data. In IEEE 24th International Conference on Data Engineering Workshop, 2008. ICDEW 2008. IEEE, pp. 600-603, International Workshop on Ranking in Databases (DBRank 2008) in conjunction with IEEE 24th International Conference on Data Engineering (ICDE 2008), Cancun, Mexico, 07/04/2008. https://doi.org/10.1109/ICDEW.2008.4498387 |
DOI: | 10.1109/icdew.2008.4498387 |
Description: | Outlier detection is an important data mining task for consistency checks, fraud detection, etc. Binary decision making on whether or not an object is an outlier is not appropriate in many applications and moreover hard to parametrize. Thus, recently, methods for outlier ranking have been proposed. Determining the degree of deviation, they do not require setting a decision boundary between outliers and the remaining data. High dimensional and heterogeneous (continuous and categorical attributes) data, however, pose a problem for most outlier ranking algorithms. In this work, we propose our OutRank approach for ranking outliers in heterogeneous high dimensional data. We introduce a consistent model for different attribute types. Our novel scoring functions transform the analyzed structure of the data to a meaningful ranking. Promising results in preliminary experiments show the potential for successful outlier ranking in high dimensional data. |
Database: | OpenAIRE |
External link: |