Showing 1 - 4 of 4 for search: '"Chimoto, Everlyn Asiko"'
Author:
Chimoto, Everlyn Asiko, Gala, Jay, Ahia, Orevaoghene, Kreutzer, Julia, Bassett, Bruce A., Hooker, Sara
Neural Machine Translation models are extremely data and compute-hungry. However, not all data points contribute equally to model training and generalization. Data pruning to remove the low-value data points has the benefit of drastically reducing th
External link:
http://arxiv.org/abs/2405.19462
Author:
Jacobs, Christiaan, Rakotonirina, Nathanaël Carraz, Chimoto, Everlyn Asiko, Bassett, Bruce A., Kamper, Herman
We consider hate speech detection through keyword spotting on radio broadcasts. One approach is to build an automatic speech recognition (ASR) system for the target low-resource language. We compare this to using acoustic word embedding (AWE) models
External link:
http://arxiv.org/abs/2306.00410
Language-agnostic sentence embeddings generated by pre-trained models such as LASER and LaBSE are attractive options for mining large datasets to produce parallel corpora for low-resource machine translation. We test LASER and LaBSE in extracting bit
External link:
http://arxiv.org/abs/2211.00046
Active learning aims to deliver maximum benefit when resources are scarce. We use COMET-QE, a reference-free evaluation metric, to select sentences for low-resource neural machine translation. Using Swahili, Kinyarwanda and Spanish for our experiment
External link:
http://arxiv.org/abs/2210.15696