Arukikata Travelogue Dataset with Geographic Entity Mention, Coreference, and Link Annotation

Authors: Higashiyama, Shohei; Ouchi, Hiroki; Teranishi, Hiroki; Otomo, Hiroyuki; Ide, Yusuke; Yamamoto, Aitaro; Shindo, Hiroyuki; Matsuda, Yuki; Wakamiya, Shoko; Inoue, Naoya; Yamada, Ikuya; Watanabe, Taro
Publication year: 2023
Subject:
Document type: Working Paper
Description: Geoparsing is a fundamental technique for analyzing geo-entity information in text. We focus on document-level geoparsing, which considers geographic relatedness among geo-entity mentions, and present a Japanese travelogue dataset designed for evaluating document-level geoparsing systems. Our dataset comprises 200 travelogue documents with rich geo-entity information: 12,171 mentions, 6,339 coreference clusters, and 2,551 geo-entities linked to geo-database entries.
Database: arXiv