Automatic Alignment and Annotation Projection for Literary Texts
Authors: | Ines Rehbein, Uli Steinbach |
---|---|
Publication year: | 2019 |
Subject: |
050101 languages & linguistics; business.industry; Computer science; 05 social sciences; 02 engineering and technology; computer.software_genre; Pipeline (software); language.human_language; German; Projection (relational algebra); Annotation; 0202 electrical engineering, electronic engineering, information engineering; language; 020201 artificial intelligence & image processing; 0501 psychology and cognitive sciences; Artificial intelligence; business; computer; Natural language processing |
Source: | LaTeCH@NAACL-HLT |
DOI: | 10.18653/v1/w19-2505 |
Description: | This paper presents a modular NLP pipeline for the creation of a parallel literature corpus, followed by annotation transfer from the source to the target language. The test case we use to evaluate our pipeline is the automatic transfer of quote and speaker mention annotations from English to German. We evaluate the different components of the pipeline and discuss challenges specific to literary texts. Our experiments show that after applying a reasonable amount of semi-automatic postprocessing we can obtain high-quality aligned and annotated resources for a new language. |
Database: | OpenAIRE |