A system for automatic English text expansion

Authors: Ehud Reiter, Jonathan Juncal-Martínez, Milagros Fernández-Gavilanes, Enrique Costa-Montenegro, Francisco J. González-Castaño, Silvia García-Méndez
Language: English
Year of publication: 2019
Subject:
General Computer Science
Computer science
media_common.quotation_subject
02 engineering and technology
computer.software_genre
Lexicon
Augmentative and alternative communication
Field (computer science)
Set (abstract data type)
0202 electrical engineering
electronic engineering
information engineering
General Materials Science
Electrical and Electronic Engineering
media_common
Grammar
business.industry
General Engineering
Natural language generation
020206 networking & telecommunications
sentence planning
natural language generation
5701.04 Lingüística Informatizada
ComputingMethodologies_DOCUMENTANDTEXTPROCESSING
surface realiser
020201 artificial intelligence & image processing
Artificial intelligence
lcsh:Electrical engineering. Electronics. Nuclear engineering
business
computer
text expansion
lcsh:TK1-9971
Natural language processing
Source: IEEE Access, Vol 7, Pp 123320-123333 (2019)
Description: We present an automatic text expansion system that generates English sentences, performing automatic Natural Language Generation (NLG) by combining linguistic rules with statistical approaches. Here, “automatic” means that the system can generate coherent and correct sentences from a minimal set of words. The design has been modular from its inception and is adaptable to other languages, which is one of its greatest advantages. For English, we have created aLexiE, a highly precise, wide-coverage lexicon that is a contribution in its own right. We have evaluated the resulting NLG library in an Augmentative and Alternative Communication (AAC) proof of concept, both directly (by regenerating corpus sentences) and manually (from annotations), using a popular corpus in the NLG field. We performed a second analysis comparing the quality of text expansion in English with that in Spanish, using an ad hoc Spanish-English parallel corpus. The system might also be applied to other domains such as report and news generation. Ministerio de Economía, Industria y Competitividad | Ref. TEC2016-76465-C2-2-R; Xunta de Galicia | Ref. GRC-2018/53; Xunta de Galicia | Ref. ED341D R2016/012; University of Aberdeen
Database: OpenAIRE