Dataset for comparable evaluation of machine translation between 11 South African languages
Authors: | Martin Puttkammer, Cindy A. McKellar |
---|---|
Language: | English |
Publication year: | 2020 |
Subject: | Machine translation; Computer science; computer.software_genre; lcsh:Computer applications to medicine. Medical informatics; Arts and Humanity; Set (abstract data type); 03 medical and health sciences; 0302 clinical medicine; Evaluation of machine translation; lcsh:Science (General); 030304 developmental biology; 0303 health sciences; Multidisciplinary; business.industry; Natural language processing; Human language technology; Languages of Africa; Language technology; lcsh:R858-859.7; Artificial intelligence; business; computer; Automatic evaluation; 030217 neurology & neurosurgery; lcsh:Q1-390 |
Source: | Data in Brief, Vol 29 (2020) |
ISSN: | 2352-3409 |
Description: | This data article describes the Autshumato machine translation evaluation set. The evaluation set contains data that can be used to evaluate machine translation systems between any of the 11 official South African languages. The dataset is parallel, with four reference translations available for each of the following languages: Afrikaans, English, isiNdebele, isiXhosa, isiZulu, Sepedi, Sesotho, Setswana, Siswati, Tshivenḓa and Xitsonga. Keywords: Machine translation, Automatic evaluation, Natural language processing, Human language technology |
Database: | OpenAIRE |
External link: |