Authors: |
XINYU ZHANG (xinyucrystina.zhang@uwaterloo.ca), KELECHI OGUEJI (kelechi.ogueji@uwaterloo.ca), XUEGUANG MA (x93ma@uwaterloo.ca), JIMMY LIN (jimmylin@uwaterloo.ca) |
Source: |
ACM Transactions on Information Systems. Mar 2024, Vol. 42, Issue 2, pp. 1-33. 33 pp. |
Abstract: |
The article presents best practices for training multilingual dense retrieval models using transformer-based bi-encoders. It addresses scenarios with and without training data in the target language. It provides guidance on multi-stage fine-tuning, cross-lingual transfer, and the use of in-domain and out-of-language data. |
Database: |
Library, Information Science & Technology Abstracts |
External link: |
|