Automatic Alignment and Annotation Projection for Literary Texts

Authors: Ines Rehbein, Uli Steinbach
Year of publication: 2019
Subject:
Source: LaTeCH@NAACL-HLT
DOI: 10.18653/v1/w19-2505
Description: This paper presents a modular NLP pipeline for the creation of a parallel literature corpus, followed by annotation transfer from the source to the target language. The test case we use to evaluate our pipeline is the automatic transfer of quote and speaker mention annotations from English to German. We evaluate the different components of the pipeline and discuss challenges specific to literary texts. Our experiments show that, after applying a reasonable amount of semi-automatic postprocessing, we can obtain high-quality aligned and annotated resources for a new language.
Database: OpenAIRE