Punjabi Stemmer Using Punjabi WordNet Database
Author: | Rajeev Puri, Vishal Goyal, R. P. S. Bedi |
---|---|
Year of publication: | 2015 |
Subject: | |
Source: | Indian Journal of Science and Technology. 8 |
ISSN: | 0974-5645 0974-6846 |
Description: | Stemming is used as a pre-processing phase in information retrieval tasks. It produces linguistically normalized text, which helps improve retrieval results. This paper discusses a revised suffix-removal approach, with an extended set of stripping rules, for building a Punjabi stemming tool. The stemming algorithm uses regular expressions to find suffix matches, and the WordNet database is used to improve the stemming results. |
Database: | OpenAIRE |
External link: |
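
The description above mentions regular-expression suffix matching combined with a WordNet lookup. The following is a minimal sketch of how such a combination might work, not the authors' implementation. The suffix list, the `LEXICON` set (a toy stand-in for the Punjabi WordNet database), and the function name `stem` are all assumptions made for illustration.

```python
import re

# Illustrative suffixes only (assumption); the paper's rule set is not given here.
SUFFIXES = ["ੀਆਂ", "ਆਂ", "ਾਂ"]

# Try longer suffixes first; anchor each pattern to the end of the word.
RULES = [
    re.compile(re.escape(suffix) + r"\Z")
    for suffix in sorted(SUFFIXES, key=len, reverse=True)
]

# Toy stand-in for a Punjabi WordNet lookup (assumption).
LEXICON = {"ਕਿਤਾਬ", "ਕੁੜੀ"}


def stem(word, lexicon=LEXICON):
    """Strip a suffix only if the result is a known word in the lexicon."""
    # A word that is already a known root is left untouched.
    if word in lexicon:
        return word
    for rule in RULES:
        candidate, count = rule.subn("", word)
        # Accept a candidate only when the lexicon confirms it.
        if count and candidate in lexicon:
            return candidate
    # No validated stem was found, so return the word unchanged.
    return word


if __name__ == "__main__":
    for w in ["ਕਿਤਾਬਾਂ", "ਕੁੜੀਆਂ", "ਕਿਤਾਬ"]:
        print(w, "->", stem(w))
```

In this sketch the lexicon check is what keeps the longest-match rule from over-stemming: the longest suffix is tried first on ਕੁੜੀਆਂ, but the resulting candidate is not in the lexicon, so the shorter suffix is tried and yields ਕੁੜੀ.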