Is Simple English Wikipedia As Simple And Easy-to-Understand As We Expect It To Be?

Authors: Sergiu Nisioi, Daniel Ibanez, Sanja Štajner
Publication year: 2020
Subject:
Source: DSAI
DOI: 10.1145/3439231.3439263
Description: The conceptual complexity of a written text plays an important role in maintaining the reader's interest in reading it. Therefore, automatic text simplification systems should, apart from considering the lexical and syntactic complexity of a text, also consider its conceptual complexity. In this study, we analyze and compare two widely used English text simplification corpora, one professionally produced (Newsela) and the other collaboratively made by amateurs and enthusiasts (English Wikipedia–Simple English Wikipedia), focusing on 19 conceptual complexity features. The results indicated that simplification operations made during the production of Simple English Wikipedia in many cases do not follow the patterns of the professionally simplified corpora, thus casting doubt on the adequacy of using Simple English Wikipedia as training material for automatic text simplification systems.
Database: OpenAIRE