Term recognition using corpora from different fields
Author: | Hitoshi Isahara, Masaki Murata, Kiyotaka Uchimoto, Satoshi Sekine, Hiromi Ozaku |
---|---|
Publication year: | 2000 |
Subject: | |
Source: | Terminology. 6:233-256 |
ISSN: | 1569-9994 0929-9971 |
DOI: | 10.1075/term.6.2.07uch |
Description: | We present a system used in the term recognition competition, one of the subtasks covered by the NTCIR tmrec group, and we evaluate its term recognition results. We regard terms as lexical items, characteristic of a field, which have the following three features: (1) they appear frequently in documents of the target field; (2) they are not common words in the target field; and (3) they appear less frequently in the corpora of other fields. Our system uses corpora from different fields and uses these features to recognize terms. We then analyze the differences between our term list and the manual candidates list produced by the NTCIR tmrec group. In this article we identify features that are important for automatic term recognition. Furthermore, through comparative experiments based on manual candidates, we establish the importance of indices in extracting a term list. |
Database: | OpenAIRE |
External link: |
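
The three features listed in the description can be illustrated with a minimal sketch. This is not the authors' system: the function name `score_terms`, the ratio-based score, the `common_words` stop list, and the `min_count` and `smoothing` parameters are all assumptions made for illustration. Feature (1) is approximated by a minimum count in the target corpus, feature (2) by excluding a caller-supplied set of common words, and feature (3) by dividing the relative frequency in the target field by the relative frequency in the other fields.

```python
from collections import Counter


def score_terms(target_tokens, other_tokens, common_words=frozenset(),
                min_count=2, smoothing=1e-6):
    """Rank candidate terms from a target-field corpus (illustrative only).

    All names and parameters here are hypothetical; the record does not
    specify how the original system combines its features.
    """
    target = Counter(target_tokens)
    other = Counter(other_tokens)
    n_target = sum(target.values())
    n_other = sum(other.values())

    scores = {}
    for word, count in target.items():
        # Feature (1): the word must appear frequently in the target field.
        if count < min_count:
            continue
        # Feature (2): skip words treated as common (assumed stop list).
        if word in common_words:
            continue
        # Feature (3): favor words that are rarer in other fields.
        rel_target = count / n_target
        rel_other = other[word] / n_other if n_other else 0.0
        scores[word] = rel_target / (rel_other + smoothing)

    # Highest-scoring candidates first.
    return sorted(scores, key=scores.get, reverse=True)
```

Calling `score_terms` with a tokenized target-field corpus and a tokenized corpus drawn from other fields returns candidate terms ordered by the assumed score. A real system would likely work with multi-word candidates and a more careful weighting than this single ratio.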