YAD: Leveraging T5 for Improved Automatic Diacritization of Yorùbá Text
Author: | Olawole, Akindele Michael; Alabi, Jesujoba O.; Sakpere, Aderonke Busayo; Adelani, David I. |
---|---|
Year of publication: | 2024 |
Subject: | |
Document type: | Working Paper |
Description: | In this work, we present the Yorùbá automatic diacritization (YAD) benchmark dataset for evaluating Yorùbá diacritization systems. In addition, we pre-train a text-to-text transformer (T5) model for Yorùbá and show that it outperforms several multilingually trained T5 models. Lastly, we show that more data and larger models yield better diacritization for Yorùbá. Comment: Accepted at AfricaNLP Workshop at ICLR 2024 |
Database: | arXiv |
External link: |