Elements of a National Semantic Web Infrastructure - Case Study Finland on the Semantic Web

Authors: Eero Hyvonen, Robin Lindroos, Teppo Kansala, Riikka Henriksson, Matias Frosterus, Jouni Tuominen, Reetta Sinkkila, Jussi Kurki, Kim Viljanen, Eetu Makela, Tomi Kauppinen, Tuukka Ruotsalo, Onni Valkeapaa, Katri Seppala, Osma Suominen, Olli Alm
Year of publication: 2007
Subject:
Source: ICSC
Description: Most research in the field of anaphora or coreference detection has been limited to noun phrase coreference, usually on a restricted set of entities, such as ACE entities. In part, this has been due to the lack of corpus resources tagged with general anaphoric coreference. The OntoNotes project is creating a large-scale, accurate corpus for general anaphoric coreference that covers entities and events not limited to noun phrases or a limited set of entity types. The coreference layer in OntoNotes constitutes one part of a multi-layer, integrated annotation of shallow semantic structure in text. This paper presents an initial model for unrestricted coreference based on this data that uses a machine learning architecture with state-of-the-art features. Significant improvements can be expected from using such cross-layer information for training predictive models. This paper describes the coreference annotation in OntoNotes, presents the baseline model, and provides an analysis of the contribution of this new resource in the context of recent MUC and ACE results.
Database: OpenAIRE