Progressive Joint Framework for Chinese Question Entity Discovery and Linking With Question Representations
Author: | Wancheng Ni, Ziqi Lin, Yiping Yang, Haidong Zhang |
---|---|
Language: | English |
Year of publication: | 2019 |
Subject: |
Information retrieval
General Computer Science; Computer science; Process (engineering); 010102 general mathematics; General Engineering; joint method; 02 engineering and technology; Entity discovery and linking; 01 natural sciences; Ranking (information retrieval); Knowledge-based systems; Entity linking; question representation model; 0202 electrical engineering, electronic engineering, information engineering; Feature (machine learning); 020201 artificial intelligence & image processing; General Materials Science; information extraction; lcsh:Electrical engineering. Electronics. Nuclear engineering; 0101 mathematics; natural language processing; lcsh:TK1-9971 |
Source: | IEEE Access, Vol 7, Pp 146282-146300 (2019) |
ISSN: | 2169-3536 |
Description: | Chinese question entity discovery and linking (QEDL) often has to cope with short texts and small-scale annotated datasets, which can make certain machine learning algorithms ineffective. In this paper, we propose a progressive joint framework for Chinese QEDL that leverages the mutual dependency between the two tasks, entity discovery and entity linking, so that each improves the performance of the other. The framework uses the candidate entity generation (CEG) step of entity linking to iteratively augment the overall entity discovery process, which consists of mention generation, filtering and merging modules. In the mention generation module, to reduce the hand-crafted effort of rule-based entity discovery, we develop a question representation method that generates domain-independent entity discovery rules, and we use CEG to check the extracted mentions in priority order. This module can feed its extracted mentions into other entity discovery methods, either as a feature or as extra mentions, to compensate for the shortage of annotated data. The mention filtering module leverages the joint features of extracted mentions and CEG’s entities to build a voting model that filters out low-confidence mentions. The mention merging module then merges the mention-entity pairs produced by different patterns and checks their corresponding candidate entities with CEG. During entity linking, we incorporate the joint features of questions, extracted mentions and CEG’s entities into a ranking model for entity disambiguation. Finally, we conduct experiments on two real-world datasets and compare our approach with other state-of-the-art methods. The results show that the proposed framework can reduce error accumulation and flexibly combine different entity discovery methods, which significantly improves performance on small-scale datasets. |
Database: | OpenAIRE |
External link: |
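
The discovery stages named in the description (mention generation checked by CEG, vote-based filtering, merging) can be illustrated with a minimal sketch. This is not the authors' implementation: the alias table, the three voters, the vote threshold and every function name below are hypothetical stand-ins chosen only to show how a candidate entity lookup can gate each stage.

```python
from typing import Callable, Dict, List

# Hypothetical toy alias table standing in for a real knowledge base.
ALIAS_TABLE: Dict[str, List[str]] = {
    "北京": ["Beijing (city)", "Beijing (film)"],
    "北京大学": ["Peking University"],
}

Voter = Callable[[str, str, List[str]], bool]


def generate_candidates(mention: str) -> List[str]:
    """CEG stand-in: return candidate entities for a mention (may be empty)."""
    return ALIAS_TABLE.get(mention, [])


def generate_mentions(question: str, max_len: int = 4) -> List[str]:
    """Enumerate spans, longest first, keeping those that CEG can resolve."""
    mentions: List[str] = []
    for length in range(max_len, 0, -1):
        for start in range(len(question) - length + 1):
            span = question[start:start + length]
            if span not in mentions and generate_candidates(span):
                mentions.append(span)
    return mentions


def filter_mentions(question: str, mentions: List[str],
                    voters: List[Voter], min_votes: int) -> List[str]:
    """Keep a mention only if at least min_votes voters accept it."""
    kept = []
    for mention in mentions:
        candidates = generate_candidates(mention)
        votes = sum(1 for vote in voters if vote(question, mention, candidates))
        if votes >= min_votes:
            kept.append(mention)
    return kept


def merge_mentions(mentions: List[str]) -> List[str]:
    """Drop any mention that is contained in a longer surviving mention."""
    return [m for m in mentions
            if not any(m != other and m in other for other in mentions)]


# Hypothetical voters built on joint mention/candidate features.
VOTERS: List[Voter] = [
    lambda q, m, c: len(c) <= 2,            # few candidates: low ambiguity
    lambda q, m, c: len(m) >= 2,            # not a single character
    lambda q, m, c: len(m) / len(q) >= 0.5,  # covers much of the question
]

question = "北京大学在哪里"
found = generate_mentions(question)
kept = filter_mentions(question, found, VOTERS, min_votes=2)
print(merge_mentions(kept))
```

In this toy setup the lookup plays the role the description assigns to CEG: a span survives generation only when it resolves to at least one candidate entity, and the same candidates supply features to the voters. A real system would replace the voters with learned models and add a ranking step for disambiguation.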