Detection of Reading Absorption in User-Generated Book Reviews: Resources Creation and Evaluation

Authors: Lendvai, Piroska; Darányi, Sándor; Geng, Christian; Kuijper, Moniek; Lopez de Lacalle, Oier; Mensonides, Jean-Christophe; Rebora, Simone; Reichel, Uwe D.
Contributors: University of Basel (Unibas), University of Boras, Zentrum für Allgemeine Sprachwissenschaft [Berlin] (ZAS), Bundesministerium für Bildung und Forschung-Deutsche Forschungsgemeinschaft - German Research Foundation (DFG), University of the Basque Country/Euskal Herriko Unibertsitatea (UPV/EHU), Informatique, Image, Intelligence Artificielle (I3A), Laboratoire de Génie Informatique et d'Ingénierie de Production (LGI2P), IMT - MINES ALES (IMT - MINES ALES), Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT)-IMT - MINES ALES (IMT - MINES ALES), Institut Mines-Télécom [Paris] (IMT)-Institut Mines-Télécom [Paris] (IMT), Research Institute for Linguistics [Budapest], Hungarian Academy of Sciences (MTA)
Language: English
Year of publication: 2020
Subject:
Source: LREC 2020 - 12th Conference on Language Resources and Evaluation, 2020, Marseille, France. pp. 4835-4841
Description: The conference was not held in person at the planned dates and venue, but the proceedings were published and distributed online; International audience; To detect how and when readers are experiencing engagement with a literary work, we bring together empirical literary studies and language technology via focusing on the affective state of absorption. The goal of our resource development is to enable the detection of different levels of reading absorption in millions of user-generated reviews hosted on social reading platforms. We present a corpus of social book reviews in English that we annotated with reading absorption categories. Based on these data, we performed supervised, sentence-level, binary classification of the explicit presence vs. absence of the mental state of absorption. We compared the performances of classical machine learners where features comprised sentence representations obtained from a pretrained embedding model (Universal Sentence Encoder) vs. neural classifiers in which sentence embedding vector representations are adapted or fine-tuned while training for the absorption recognition task. We discuss the challenges in creating the labeled data as well as the possibilities for releasing a benchmark corpus.
Database: OpenAIRE