Author: |
Jungiewicz, Michał, Łopuszyński, Michał |
Publication year: |
2014 |
Subject: |
|
Source: |
Lecture Notes in Computer Science, Volume 8686, Springer 2014, pp 65-70 |
Document type: |
Working Paper |
DOI: |
10.1007/978-3-319-10888-9_7 |
Description: |
In this work, we present an application of the recently proposed unsupervised keyword extraction algorithm RAKE to a corpus of Polish legal texts from the field of public procurement. RAKE is essentially a language- and domain-independent method. Its only language-specific input is a stoplist containing a set of non-content words. The performance of the method depends heavily on the choice of this stoplist, which should be adapted to the domain. Therefore, we complement the RAKE algorithm with an automatic approach to selecting non-content words, based on the statistical properties of term distribution. |
Database: |
arXiv |
External link: |
|