Construction of a Chinese Corpus for Multi-Type Economic Event Relation

Authors: Qizhi Wan, Changxuan Wan, Keli Xiao, Dexi Liu, Qing Liu, Jiangling Deng, Wenkang Luo, Rong Hu
Publication year: 2022
Subject:
Source: ACM Transactions on Asian and Low-Resource Language Information Processing. 21:1-20
ISSN: 2375-4702, 2375-4699
DOI: 10.1145/3527240
Description: We construct a Chinese Economic Event Treebank (CEETB), focusing on revealing economic and financial events and their relations. Investigating economic event relations will benefit academic research and practice not only in economics but also in many other scientific areas. The characteristics of economic texts (e.g., abundant long enterprise names and terms) and the peculiarities of the Chinese language (e.g., component ellipsis in long sentences) make event relation extraction challenging. Existing Chinese corpora containing economic event relations focus mainly on financial domains (e.g., the equity market) and cover only a few event types. To support research involving economic text analysis in Chinese, CEETB is constructed following a carefully designed process. First, based on practical and research requirements, we summarize nine types of event relations and four types of component ellipsis in economic texts. Then, an annotation scheme is presented that makes the annotation model, strategy, and process explicit, followed by statistical analysis and quality evaluation of the CEETB corpus. Finally, to demonstrate the strengths of the constructed corpus in practical applications, we conduct experiments with five state-of-the-art (SOTA) models for event relation extraction.
Database: OpenAIRE