STR: A GRAPH-BASED TAGGING TECHNIQUE

Authors: Carlos G. Vallejo, Fermín L. Cruz, José A. Troyano, Francisco J. Galán, F. Javier Ortega
Year of publication: 2011
Subject:
Source: International Journal on Artificial Intelligence Tools. 20:955-967
ISSN: 1793-6349, 0218-2130
DOI: 10.1142/s0218213011000437
Description: This paper presents the ideas, experiments and specifications related to the Supervised TextRank (STR) technique, a word tagging method based on the TextRank algorithm. The main innovation of the STR technique is the use of a graph-based ranking algorithm similar to PageRank in a supervised fashion, gathering the information needed to build the graph representations of the text from a tagged corpus. We also propose a flexible graph specification language that makes it easy to experiment with multiple configurations for the topology of the graph and for the information associated with the nodes and the edges. We have carried out experiments on the part-of-speech tagging task, a common tagging problem in Natural Language Processing. Our best result reaches a precision of 96.16%, on a par with the best tagging tools.
Database: OpenAIRE
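
Illustration: the description mentions a PageRank-like ranking over a graph built from a tagged corpus. The sketch below is not the authors' STR method or their graph specification language; it is a generic weighted PageRank by power iteration, applied to a toy graph whose nodes are hypothetical word/tag candidates. The node names, the edge weights (standing in for co-occurrence counts from a tagged corpus), the damping value and the function name `pagerank` are all assumptions made for this example.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Weighted PageRank by power iteration.

    edges maps (source, target) -> positive weight.
    Returns a dict mapping each node to its rank; ranks sum to 1.
    """
    nodes = sorted({n for edge in edges for n in edge})
    count = len(nodes)

    # Total outgoing weight per node, used to normalise edge weights.
    out_weight = {n: 0.0 for n in nodes}
    for (src, _dst), w in edges.items():
        out_weight[src] += w

    rank = {n: 1.0 / count for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread uniformly.
        dangling = sum(rank[n] for n in nodes if out_weight[n] == 0.0)
        base = (1.0 - damping) / count + damping * dangling / count
        new_rank = {n: base for n in nodes}
        for (src, dst), w in edges.items():
            new_rank[dst] += damping * rank[src] * w / out_weight[src]
        rank = new_rank
    return rank


# Hypothetical candidates for the phrase "the can rusts".
# Weights are made-up counts, not data from the paper.
undirected = {
    ("the/DT", "can/NN"): 8.0,
    ("the/DT", "can/MD"): 1.0,
    ("can/NN", "rusts/VBZ"): 5.0,
    ("can/MD", "rusts/VBZ"): 1.0,
}
# Add both directions so the graph behaves as undirected.
edges = {}
for (a, b), w in undirected.items():
    edges[(a, b)] = w
    edges[(b, a)] = w

ranks = pagerank(edges)
# Pick the better-ranked tag candidate for the ambiguous word "can".
best = max(["can/NN", "can/MD"], key=ranks.get)
print(best)
```

In this toy setting the candidate with the stronger corpus-derived links receives the higher rank, which is the general intuition behind choosing a tag by graph ranking; how STR actually builds and weights its graphs is specified in the paper itself.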