Author: |
Juričić, Vedran; Soleša, Dragan; Dunđer, Ivan |
Contributors: |
Damir Boras, Nives Mikelić Preradović, Francisco Moya, Mohamed Roushdy, Abdel-Badeeh M. Salem |
Language: |
English |
Year of publication: |
2013 |
Description: |
This paper analyzes how a document comparison system changes as the length of the hash values of documents' n-grams changes, that is, as the number of bits used to store each hash value varies. A hash-based document comparison system was developed and used for the analyses. The authors examined how hash value length affects disk space requirements, comparison time and F-measure, in order to find the optimum length: a balance between the best performance and the lowest space and time requirements. Because these dependencies were regular, the authors attempted to approximate the measured values with exponential and trigonometric functions. |
Database: |
OpenAIRE |
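
The description above mentions hashing documents' n-grams and varying the number of bits kept per hash value. As a rough illustration only, the following Python sketch shows one way such a comparison could work. It is not the authors' system: the choice of word trigrams, MD5 as the hash function, truncation by masking the low bits, Jaccard overlap as the similarity score, and all function names are assumptions made for this example.

```python
import hashlib


def ngrams(text, n=3):
    """Return the word n-grams of a text (assumed tokenization: whitespace)."""
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def truncated_hash(gram, bits):
    """Hash an n-gram and keep only its lowest `bits` bits (1..128).

    MD5 is an assumption here; the record does not name the hash function.
    """
    digest = hashlib.md5(gram.encode("utf-8")).digest()
    value = int.from_bytes(digest, "big")
    return value & ((1 << bits) - 1)


def fingerprint(text, bits, n=3):
    """Set of truncated n-gram hashes representing one document."""
    return {truncated_hash(g, bits) for g in ngrams(text, n)}


def similarity(doc_a, doc_b, bits, n=3):
    """Jaccard overlap of two fingerprints (an assumed similarity measure)."""
    fa, fb = fingerprint(doc_a, bits, n), fingerprint(doc_b, bits, n)
    if not fa or not fb:
        return 0.0
    return len(fa & fb) / len(fa | fb)


def f_measure(precision, recall):
    """Standard F1 score: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Shorter hash values need less storage per n-gram but make collisions between different n-grams more likely, which can inflate the similarity score; this is the kind of trade-off between space, time and F-measure that the description refers to.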