Term recognition using corpora from different fields

Authors: Hitoshi Isahara, Masaki Murata, Kiyotaka Uchimoto, Satoshi Sekine, Hiromi Ozaku
Year of publication: 2000
Subject:
Source: Terminology. 6:233-256
ISSN: 1569-9994, 0929-9971
DOI: 10.1075/term.6.2.07uch
Description: We present a system used in the term recognition competition, one of the subtasks covered by the NTCIR tmrec group, and we evaluate its term recognition results. We regard terms as lexical items characteristic of a field that have the following three features: (1) they appear frequently in documents of the target field; (2) they are not common words in the target field; and (3) they appear less frequently in the corpora of other fields. Our system uses corpora from different fields and exploits these features to recognize terms. We then analyze the differences between our term list and the manual candidates list produced by the NTCIR tmrec group. In this article we identify features that are important for automatic term recognition. Furthermore, through comparative experiments based on manual candidates, we establish the importance of indices in extracting a term list.
Database: OpenAIRE
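
The three features listed in the description can be illustrated with a minimal Python sketch. The abstract does not give the scoring formula, so everything here is an assumption made for illustration: the function names `relative_freq` and `score_terms`, the `min_count` threshold standing in for feature (1), a stoplist standing in for feature (2), and an add-one-smoothed frequency ratio standing in for feature (3). It is not the authors' system.

```python
from collections import Counter


def relative_freq(counts, total, word, vocab_size):
    """Add-one smoothed relative frequency of `word` (smoothing is an assumption)."""
    return (counts[word] + 1) / (total + vocab_size)


def score_terms(target_tokens, other_tokens, common_words, min_count=2):
    """Rank candidate terms from a target-field corpus (hypothetical helper).

    Feature 1: the word occurs at least `min_count` times in the target corpus.
    Feature 2: the word is not in `common_words` (a stoplist stands in here).
    Feature 3: the word is relatively rarer in the other-field corpus,
               measured as a ratio of smoothed relative frequencies.
    """
    target = Counter(target_tokens)
    other = Counter(other_tokens)
    vocab_size = len(set(target) | set(other))
    target_total = sum(target.values())
    other_total = sum(other.values())

    scores = {}
    for word, count in target.items():
        if count < min_count:        # feature 1
            continue
        if word in common_words:     # feature 2
            continue
        in_target = relative_freq(target, target_total, word, vocab_size)
        in_other = relative_freq(other, other_total, word, vocab_size)
        scores[word] = in_target / in_other  # feature 3
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    target = "the parser builds a parse tree the parser reads tokens".split()
    other = "the market fell and the market rose".split()
    for word, score in score_terms(target, other, {"the", "a", "and"}):
        print(f"{word}\t{score:.2f}")
```

A ratio above 1 means the word is relatively more frequent in the target field than elsewhere. A real system would work on much larger corpora and on multi-word candidates, which this sketch does not attempt.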