Abstrakt: |
The Historical Thesaurus of English (HT) categorizes the English vocabulary into meaning-based categories containing synonymous or near-synonymous lexical items. Its extensive hierarchical categorization scheme (c. 235,000 categories) is, however, unwieldy for new and more casual users. Additionally, it can be too fine-grained for implementation in natural language processing tools, leading them to produce over-specified results where humans recognize ambiguity. This article outlines the reasoning and methodology behind the creation of a truncated semantic hierarchy, known by the HT editors as the thematic category set. Development of the category set was guided by evaluation of which concepts are too technical or too general to be relevant to the average user, with cut-offs imposed on the hierarchy at an intermediate human-scale level. The result is a hybrid category set which combines organisation by synonymy at higher levels of abstraction and organisation by something approaching a conceptual field at lower, less abstract levels.