On the Role of Low-Level Linguistic Tasks for Reading Time Prediction

Authors: Abdellah Fourtassi, Franck Dary, Alexis Nasr
Contributors: Laboratoire d'Informatique et Systèmes (LIS), Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS), Aix Marseille Université (AMU), Traitement Automatique du Langage Ecrit et Parlé (TALEP), Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS)-Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS), Dary, Franck
Language: English
Publication year: 2021
Subject:
Cognitive model
[INFO.INFO-AI] Computer Science [cs]/Artificial Intelligence [cs.AI]
Computer science
[INFO.INFO-TT] Computer Science [cs]/Document and Text Processing
surprisal
[INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG]
Reading (process)
reading time
Parsing
Lemmatisation
Lexical analysis
[SDV.NEU.SC]Life Sciences [q-bio]/Neurons and Cognition [q-bio.NC]/Cognitive Sciences
Cognition
cognitive modeling
Linguistics
Comprehension
entropy
Sentence
Source: Proceedings of the Annual Meeting of the Cognitive Science Society, 43(43)
43rd Annual Meeting of the Cognitive Science Society
43rd Annual Meeting of the Cognitive Science Society, Jul 2021, Vienna, Austria. pp.452
Description: It has been shown that complexity metrics computed by a syntactic parser are predictors of human reading time, which is an approximation of human sentence comprehension difficulty. Nevertheless, parsers usually take as input sentences that have already been processed or even manually annotated. We propose to study a more realistic scenario, where the various processing levels (tokenization, PoS and morphology tagging, lemmatization, syntactic parsing and sentence segmentation) are predicted incrementally from raw text. To this end, we propose a versatile modeling framework, which we call the Reading Machine, that performs all of these linguistic tasks and allows cognitive constraints such as incrementality to be incorporated. We illustrate the behavior of this setting through a case study in which we test the hypothesis that the complexity metrics computed at different processing levels predict human reading difficulty, and that applying cognitive constraints (e.g., incrementality) to the machine yields better predictions.
Database: OpenAIRE
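
Note: the subject terms list surprisal and entropy, but this record does not say how the paper computes its complexity metrics. As a rough illustration only, the standard information-theoretic definitions can be sketched as follows. The function names and the next-tag distribution are hypothetical and are not taken from the paper or from the Reading Machine.

```python
import math


def surprisal(prob: float) -> float:
    """Surprisal in bits of an outcome the model assigned probability `prob`."""
    if not 0.0 < prob <= 1.0:
        raise ValueError("prob must be in (0, 1]")
    return -math.log2(prob)


def entropy(dist: dict[str, float]) -> float:
    """Shannon entropy in bits of a predicted distribution over outcomes."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0.0)


# Hypothetical distribution a tagger might predict for the next PoS tag.
# These numbers are made up for illustration.
next_tag = {"NOUN": 0.5, "VERB": 0.25, "ADJ": 0.25}

uncertainty_before = entropy(next_tag)           # how uncertain the prediction is
cost_of_observed = surprisal(next_tag["VERB"])   # how unexpected the actual tag was
```

Under this reading, a per-token metric of this kind could be computed at each processing level (tagging, lemmatization, parsing) and then used as a predictor of reading time; whether the paper does exactly this is an assumption.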