Text and sentence histories for analyzing the production of multi-word structures

Author: Mahlow, Cerstin
Language: English
Year of publication: 2022
Subject:
Description: Invited talk at Université Sorbonne nouvelle in the project "ANR Pro-TEXT – Les processus de textualisation: modélisations linguistiques, psycholinguistiques et d’apprentissage automatique" (https://pro-text.huma-num.fr), a project of Clesthia, Université Sorbonne nouvelle (USN); CERCA, CNRS – Université de Poitiers (UdP); and LIPN, CNRS – Université Paris Nord (UPN).
We are currently working on THETool (Text History Extraction Tool). The goal is to explore writing on a structural level (syntax in the broadest sense). We have two concrete goals for our research:
(a) on a theoretical level: How do writers produce (i.e., write and revise, including deletion) multi-word discourse structures such as:
- argumentative elements ("on the one hand ... on the other hand")
- hedges ("so to speak")
- boosters ("in fact")
(b) on a practical level: How can we support writers in using those structures effectively in academic writing (general use, variation, etc.)?
With THETool we can parse keystroke-logging data and create text and sentence histories for a particular writing session. A sentence history covers all events relevant to a particular sentence, so one can follow what the writer did even when they came back to that sentence several times. As we are interested in multi-word structures, we introduce the notion of relevant edits. This allows us to filter out production and editing we are not interested in; here, that means word-level edits such as corrections of potential typos and spelling errors. In this talk I will present the architecture and functioning of THETool and some first results for German writing sessions.
Database: OpenAIRE