The impact of sections headings on the document retrieval

Authors: Jean-Marie Pinon, Okba Kazar, Belkacem Abdelli
Publication year: 2014
Subject:
Source: ICDIM
DOI: 10.1109/icdim.2014.6991398
Description: With online publication, the Web has become the largest source of digital documents, often stored as HTML, XML, PDF, or DOC. Among the features of these documents, their logical structure is especially notable: it represents components such as chapters, sections, paragraphs, the document title, and chapter and section titles. Section headings are meaningful; they are a good indicator of the content of the paragraphs they introduce. For this reason, we pay particular attention to these titles during indexing and search. Our objective is to provide relevant access to digital documents by processing all section titles, taking advantage of their meaning and importance in the search process. Experiments on a large corpus, INEX 2009, show the effectiveness of our proposal: an improvement in the precision of the results in IR.
Database: OpenAIRE