Authors: |
Seungyon Lee, Jan P. Allebach, Zygmunt Pizlo, Aziza Satkhozhina, Ildus Ahmadullin |
Year of publication: |
2012 |
Source: |
Imaging and Printing in a Web 2.0 World III. |
ISSN: |
0277-786X |
DOI: |
10.1117/12.910860 |
Description: |
Applications that classify and search documents by visual appearance need to know which document features matter most to human perception when people compare documents. This paper presents the results of a psychophysical experiment in which subjects were asked to group documents by visual similarity. Results from 15 subjects were stored as similarity matrices and tested for inter-rater agreement. The similarity matrix averaged across subjects was analyzed with agglomerative hierarchical clustering to identify clusters. The human clustering was then approximated by a weighted sum of four distance matrices calculated from four document features. We identified the relative importance of the document features using an optimization method, and tested the approximation using K-fold cross-validation and the K-nearest neighbor algorithm. The results of the testing confirm the effectiveness of our approach. |
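The weighted-sum approximation described above can be illustrated with a small sketch. The record does not say which optimization method the authors used, so ordinary least squares stands in for it here. The distance matrices, the weights in `true_w`, and the helper names `random_distance_matrix` and `fit_feature_weights` are all hypothetical, made up for illustration rather than taken from the paper.

```python
import numpy as np


def random_distance_matrix(n, rng):
    """Synthetic symmetric distance matrix (zero diagonal) from random 2-D points."""
    pts = rng.random((n, 2))
    diff = pts[:, None, :] - pts[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def fit_feature_weights(human_dist, feature_dists):
    """Least-squares weights w so that sum_k w[k] * feature_dists[k] approximates human_dist.

    Assumption: plain least squares stands in for the paper's unspecified optimizer.
    """
    n = human_dist.shape[0]
    iu = np.triu_indices(n, k=1)  # use each document pair exactly once
    A = np.stack([d[iu] for d in feature_dists], axis=1)  # shape: (pairs, features)
    b = human_dist[iu]
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    return w


rng = np.random.default_rng(0)
n_docs = 12

# Four synthetic per-feature distance matrices (placeholders for real document features).
feature_dists = [random_distance_matrix(n_docs, rng) for _ in range(4)]

# Made-up "human" distance matrix: a known weighted sum of the four feature matrices.
true_w = np.array([0.4, 0.3, 0.2, 0.1])
human_dist = sum(w * d for w, d in zip(true_w, feature_dists))

weights = fit_feature_weights(human_dist, feature_dists)
```

Because the synthetic target is an exact weighted sum, the fit recovers `true_w`. With real subject data the fit would be approximate, and a non-negativity constraint on the weights may be a sensible addition.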
Database: |
OpenAIRE |