Natural Language Processing Based Identification of Related Short Forum Posts Through Knowledge Based Conceptualization

Authors: Ajithkumar. A. K, J. C. Miraclin Joyce Pamila, R. Senthamil Selvi
Year of publication: 2021
Subject:
Source: 2021 International Conference on Artificial Intelligence and Smart Systems (ICAIS).
DOI: 10.1109/icais50930.2021.9396051
Description: Online communities collaborate and share views through online forums. The experience and ideas that users share there are rich, but finding relevant forum posts is laborious and frustrating. This research compares a post at hand against existing forum posts to find those related to it. Conventional text-similarity methods perform poorly at finding related content because they do not conceptualize short text. This paper proposes a novel scheme for identifying related short forum posts in discussion forums. In contrast to the fixed vocabulary sets used in existing schemes, the proposed method dynamically forms a joint word set from the distinct words in the forum post pair. A knowledge base is used to derive a raw semantic vector for each forum post, and the two semantic vectors are then used to compute semantic similarity. The proposed framework uses inverted indexing to retrieve relevant forum posts more efficiently, reducing the search space using synonyms of the forum post at hand. In a set of tests, it found related forum posts in discussion forums with a recall of 90%. It was also observed that precision can be improved with Named Entity Recognition.
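
The pipeline outlined in the description (joint word set, knowledge-base semantic vectors, similarity score, inverted-index candidate retrieval) can be illustrated with a minimal Python sketch. This record gives no implementation details, so everything concrete here is an assumption: the tiny SYNONYMS table stands in for the unspecified knowledge base, the 0.8 synonym score is arbitrary, cosine similarity is one plausible choice for comparing the two vectors, and all function names are hypothetical.

```python
import math
import re
from collections import defaultdict

# Hypothetical toy knowledge base (assumption): word -> synonyms.
# The paper's actual knowledge base is not named in this record.
SYNONYMS = {
    "laptop": {"notebook", "computer"},
    "notebook": {"laptop"},
    "crash": {"freeze", "hang"},
    "freeze": {"crash", "hang"},
}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def word_similarity(a, b):
    """1.0 for identical words, 0.8 (assumed score) for synonyms, else 0.0."""
    if a == b:
        return 1.0
    if b in SYNONYMS.get(a, ()) or a in SYNONYMS.get(b, ()):
        return 0.8
    return 0.0

def semantic_vector(words, joint):
    """For each joint-set word, keep its best match among the post's words."""
    return [max((word_similarity(j, w) for w in words), default=0.0)
            for j in joint]

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return sum(x * y for x, y in zip(u, v)) / norm if norm else 0.0

def post_similarity(post_a, post_b):
    """Build the joint word set from this pair only, then compare vectors."""
    words_a, words_b = tokenize(post_a), tokenize(post_b)
    joint = sorted(set(words_a) | set(words_b))
    return cosine(semantic_vector(words_a, joint),
                  semantic_vector(words_b, joint))

def build_inverted_index(posts):
    """Map each word to the ids of the posts that contain it."""
    index = defaultdict(set)
    for post_id, text in posts.items():
        for word in set(tokenize(text)):
            index[word].add(post_id)
    return index

def candidates(query, index):
    """Ids of posts sharing a word, or a synonym of a word, with the query."""
    terms = set()
    for word in tokenize(query):
        terms.add(word)
        terms |= SYNONYMS.get(word, set())
    found = set()
    for term in terms:
        found |= index.get(term, set())
    return found
```

In this sketch only the posts returned by `candidates` would be scored with `post_similarity`, which is how the inverted index narrows the search space before the more expensive pairwise comparison.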
Database: OpenAIRE