Bootstrapping and Collaboratively Enriching the Italian Domain WordNet through the WiKyoto Knowledge Editor
Author: | Ronzano F., Monachini M., Marchetti A., Tesconi M., Calzolari N. |
---|---|
Language: | English |
Year of publication: | 2010 |
Subject: | |
Source: | Multilinguality and Interoperability in Language Processing with Emphasis on Romanian, edited by Tufis D.; Forascu I., pp. 181–208. Bucharest: Romanian Academy Publishing House, 2010 |
Description: | Enhancing the development of multilingual resources for use in computer applications is of utmost importance. The need for ever-growing resources for effective multilingual content processing has driven a radical change in the perspective on language resource (LR) creation, structuring, exploitation and maintenance. The Web has played a key role in this process: the possibility of accessing growing amounts of structured and unstructured data, as well as the ease of creating and sharing content among distributed communities of users, has strongly affected the methodologies and techniques used to bootstrap, enrich and access LRs. From static knowledge bases, usually created and maintained by groups of experts and tailored to specific exploitation contexts, LRs have turned into dynamic repositories of linguistic knowledge. Their content is usually easily accessible over the Web and is often exploited, aggregated and optimized on the fly by online information mining services. In this context, the adoption of standardized data formats to facilitate interoperability and data exchange is essential. Moreover, the creation and maintenance of these resources have benefited greatly from the possibility of harvesting Web data in order to bootstrap or enrich them. Several new frameworks have been proposed to support access, search, integration and interoperability of "new generation" LRs. Large distributed communities of Web users are increasingly involved, directly or indirectly, in keeping language resources updated or in extending them. After a brief description of modern LRs, we focus on two essential issues involving them: the need for standard formats that support interoperability in a distributed Web context and the possibility for Web communities to collaboratively maintain and enrich these resources. 
In particular, we present the Italian WordNet (IWN) and its exploitation in the context of the KYOTO Project, as a real-world scenario where standardization, interlinking, enrichment as well as collaborative editing are put into practice. |
Database: | OpenAIRE |