Punjabi Stemmer Using Punjabi WordNet Database

Authors: Rajeev Puri, Vishal Goyal, R. P. S. Bedi
Year of publication: 2015
Subject:
Source: Indian Journal of Science and Technology. 8
ISSN: 0974-5645
0974-6846
Description: Stemming is used as a pre-processing phase in information retrieval tasks. The stemming process produces linguistically normalized text, which helps improve the results of those tasks. This paper discusses a revised suffix-removal approach with an extended set of stripping rules for building a Punjabi-language stemming tool. The stemming algorithm uses regular expressions to find suffix matches, and the WordNet database is used to improve the stemming results.
Database: OpenAIRE