Popis: |
With online publications, the current Web has become the largest source of digital documents, often stored in HTML, XML, PDF or DOC. Among the features of documents, note especially their logical structure, which represents their components such as chapters, sections, paragraphs, the document title, chapter titles, sections, etc. The section headings are meaningful; they are a good indicator of the content of paragraphs. For this reason we pay particular attention to these titles during the indexing process and research. Our objective is to provide relevant access to digital documents, by the process of all sections titles to take advantage of their mining and importance in the research process. Experiments on a large corpus, INEX 2009 showeffectiveness of our proposition an improvement in the precision of the results in IR |