Zachycení výstavby textu v Pražském závislostním korpusu = Annotation of discourse phenomena in the Prague Dependency Treebank.

Author: Zikánová, Šárka
Other authors:
Language: Czech
Document type: Non-fiction
ISSN: 0037-7031
Abstract: Language corpora annotation schemes nowadays cover various layers of sentence description, from morphology to semantics. Annotation projects concerning phenomena beyond the sentence boundary, however, have only recently begun to attract the attention of corpus linguists. In the present contribution, we describe a unified approach to the analysis of discourse phenomena, developed for large-scale annotation of Czech empirical data in the Prague Dependency Treebank. This approach rests on two fundamental pillars: (i) it exploits the results of one of the first complex schemes for discourse annotation, proposed and realized in the Penn Discourse Treebank for English; (ii) it follows the Praguian Functional Generative Description and treebanking tradition, taking advantage of the tectogrammatical (underlying) layer of sentence analysis and extending it to a full discourse-level description. Our analysis concentrates on two major aspects of discourse coherence: (i) discourse relations (semantic relations between discourse segments) and discourse connectives as their lexical anchors; and (ii) coreference and so-called bridging anaphora. We present a detailed description of the annotation scheme and procedure, address individual problematic issues, and offer basic corpus statistics and an annotation evaluation.
Database: Katalog Knihovny AV ČR