Understanding Graph Structure of Wikipedia for Query Expansion

Authors: Guisado-Gámez, Joan; Prat-Pérez, Arnau
Publication year: 2015
Subject:
Document type: Working Paper
DOI: 10.1145/2764947.2764953
Description: Knowledge bases are very good sources for knowledge extraction, that is, the ability to create knowledge from structured and unstructured sources and use it to improve automatic processes such as query expansion. However, extracting knowledge from unstructured sources is still an open challenge. In this respect, understanding the structure of knowledge bases can significantly improve the effectiveness of knowledge extraction. In particular, Wikipedia has become a very popular knowledge base in recent years because it is a general encyclopedia that contains a large amount of information and thus covers a wide range of topics. In this work, we analyze how the articles and categories of Wikipedia relate to each other and how these relationships can support a query expansion technique. In particular, we show that structures in the form of dense cycles with a minimum number of categories tend to identify the most relevant information.
Database: arXiv