An Efficient Indexing and Searching Technique for Information Retrieval for Urdu Language

Author: Qureshi, Muhammad Mudassar, Shoaib, Muhammad, Kalsoom
Publication year: 2021
Subject:
Document type: Working Paper
Description: Indexing techniques improve the retrieval of data in response to a search condition, and inverted files are the structure most commonly used to build indexes. This paper proposes an indexing technique for the Urdu language. The language-processing step of index creation is language-specific, so we discuss the index creation steps specifically for Urdu. We explore the morphological rules of Urdu and implement them to create an Urdu stemmer. We implement the proposed technique in several variants and compare the results. We suggest that indexes be created without stop words and that the index file be an ordered index file.
Database: arXiv
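
The abstract's two recommendations (drop stop words, keep the index ordered) can be illustrated with a minimal inverted-index sketch. This is not the authors' implementation: the stop-word list, the sample documents, and the names `build_index`, `search`, and `stem` are assumptions made for illustration, and `stem` is an identity placeholder because the record does not give the paper's morphological rules.

```python
from collections import defaultdict

# Illustrative stop-word list (an assumption, not the paper's list).
STOP_WORDS = {"کا", "کی", "کے", "ہے", "میں", "اور"}


def identity_stem(token):
    """Placeholder stemmer; a real Urdu stemmer would strip affixes here."""
    return token


def build_index(docs, stop_words=STOP_WORDS, stem=identity_stem):
    """Map each term to the sorted list of ids of documents containing it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.split():  # whitespace tokenization, a simplification
            if token in stop_words:
                continue  # stop words never enter the index
            postings[stem(token)].add(doc_id)
    # Ordered index: terms are emitted in sorted order, so a file written
    # from this mapping could be searched with binary search.
    return {term: sorted(ids) for term, ids in sorted(postings.items())}


def search(index, query, stop_words=STOP_WORDS, stem=identity_stem):
    """Return ids of documents containing every non-stop-word query term."""
    terms = [stem(t) for t in query.split() if t not in stop_words]
    if not terms:
        return []
    result = set(index.get(terms[0], []))
    for term in terms[1:]:
        result &= set(index.get(term, []))  # AND semantics across terms
    return sorted(result)


# Hypothetical sample documents.
docs = {
    1: "پاکستان کی قومی زبان اردو ہے",
    2: "اردو میں شاعری",
    3: "پاکستان اور شاعری",
}
index = build_index(docs)
```

Calling `search(index, "اردو")` returns the ids of the documents containing that word. Swapping `identity_stem` for a rule-based stemmer would let inflected forms share one postings list, which is the role the abstract assigns to the Urdu stemmer.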