Concordance Comparison as a Means of Assembling Local Grammars

Authors: Eric Laporte, Juliana P. C. Pirovani, Elias de Oliveira
Contributors: Departamento de Computação, Universidade Federal do Espírito Santo (UFES), Departamento de Arquivologia, Laboratoire d'Informatique Gaspard-Monge (LIGM), Centre National de la Recherche Scientifique (CNRS)-Fédération de Recherche Bézout-ESIEE Paris-École des Ponts ParisTech (ENPC)-Université Paris-Est Marne-la-Vallée (UPEM), Aline Villavicencio, Moreira Viviane, Alberto Abad, Helena Caseli, Pablo Gamallo, Carlos Ramisch, Hugo Ricardo Gonçalo Oliveira, Gustavo Henrique Paetzold, Université Paris-Est Marne-la-Vallée (UPEM)-École des Ponts ParisTech (ENPC)-ESIEE Paris-Fédération de Recherche Bézout-Centre National de la Recherche Scientifique (CNRS)
Language: English
Publication year: 2018
Subject:
Source: Computational Processing of the Portuguese Language. 13th International Conference, PROPOR 2018, Canela, Brazil, September 24–26, 2018, Proceedings
Aline Villavicencio; Moreira Viviane; Alberto Abad; Helena Caseli; Pablo Gamallo; Carlos Ramisch; Hugo Ricardo Gonçalo Oliveira; Gustavo Henrique Paetzold. Computational Processing of the Portuguese Language. 13th International Conference, PROPOR 2018, Canela, Brazil, September 24–26, 2018, Proceedings, 11122, Springer, pp.57-65, 2018, Lecture Notes in Artificial Intelligence, 978-3-319-99721-6. ⟨10.1007/978-3-319-99722-3_6⟩
Lecture Notes in Computer Science ISBN: 9783319997216
PROPOR
DOI: 10.1007/978-3-319-99722-3_6
Description: International audience; Named Entity Recognition for person names is an important but non-trivial task in information extraction. This article uses a tool that compares the concordances obtained from two local grammars (LGs) and highlights the differences. We used the results as an aid to select the best of a set of LGs. By analyzing the comparisons, we observed relationships of inclusion, intersection and disjunction within each pair of LGs, which helped us to assemble those that yielded the best results. This approach was used in a case study on the extraction of person names from texts written in Portuguese. We applied the enhanced grammar to the Gold Collection of the Second HAREM. The F-Measure obtained was 76.86, a gain of 6 points over the state of the art for Portuguese.
Database: OpenAIRE
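
The abstract describes comparing the concordances of two local grammars and classifying each pair by inclusion, intersection or disjunction. The following is a minimal sketch of that idea, not the tool used by the authors. It assumes each concordance can be reduced to a set of (start, end) character offsets of the matched spans; the function names and the sample offsets are hypothetical.

```python
from typing import Dict, Set, Tuple

# Assumption: one match is identified by its (start, end) offsets in the text.
Span = Tuple[int, int]


def compare_concordances(a: Set[Span], b: Set[Span]) -> Dict[str, Set[Span]]:
    """Split the matches of two grammars into shared and exclusive parts."""
    return {"both": a & b, "only_a": a - b, "only_b": b - a}


def relation(a: Set[Span], b: Set[Span]) -> str:
    """Name the set relationship between two concordances."""
    if a == b:
        return "identical"
    if a <= b or b <= a:
        return "inclusion"      # one grammar matches everything the other does
    if a & b:
        return "intersection"   # some shared matches, some exclusive to each
    return "disjunction"        # no match in common


# Hypothetical matches produced by three grammars on the same text.
lg1 = {(0, 12), (40, 55)}
lg2 = {(0, 12), (40, 55), (90, 101)}
lg3 = {(200, 214)}

diff = compare_concordances(lg1, lg2)
rel_12 = relation(lg1, lg2)
rel_13 = relation(lg1, lg3)
```

Under these assumptions, a pair in a disjunction relationship is a natural candidate for being combined, since each grammar contributes matches the other misses, while an inclusion relationship suggests the smaller grammar adds nothing new to the larger one.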