Dataset for comparable evaluation of machine translation between 11 South African languages

Authors: Martin Puttkammer, Cindy A. McKellar
Language: English
Year of publication: 2020
Subject:
Source: Data in Brief, Vol 29 (2020)
ISSN: 2352-3409
Description: This data article describes the Autshumato machine translation evaluation set, which can be used to evaluate machine translation systems between any of the 11 official South African languages. The dataset is parallel, with four reference translations available for each of the following languages: Afrikaans, English, isiNdebele, isiXhosa, isiZulu, Sepedi, Sesotho, Setswana, Siswati, Tshivenḓa and Xitsonga.
Keywords: Machine translation, Automatic evaluation, Natural language processing, Human language technology
Database: OpenAIRE