Leveraging Translation For Optimal Recall: Tailoring LLM Personalization With User Profiles

Authors: Ravichandran, Karthik; Gomasta, Sarmistha Sarna
Year of publication: 2024
Subject:
Document type: Working Paper
Description: This paper explores a novel technique for improving recall in cross-language information retrieval (CLIR) systems using iterative query refinement grounded in the user's lexical-semantic space. The proposed methodology combines multi-level translation, semantic embedding-based expansion, and user profile-centered augmentation to address the challenge of matching variance between user queries and relevant documents. Through an initial BM25 retrieval, translation into intermediate languages, embedding lookup of similar terms, and iterative re-ranking, the technique aims to expand the scope of potentially relevant results personalized to the individual user. In comparative experiments on news and Twitter datasets, the proposed approach outperforms baseline BM25 ranking across ROUGE metrics. The translation methodology also maintained semantic accuracy through the multi-step process. This personalized CLIR framework paves the path for improved context-aware retrieval attentive to the nuances of user language.
Comment: This is just an initial idea and its implementation. The results are computed for the first 100 data points. Detailed results will be published with the actual paper.
Database: arXiv
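
Illustrative note: the description above names an initial BM25 retrieval followed by query expansion and re-ranking. The following is a minimal sketch of that general idea, not the authors' implementation. It assumes the standard Okapi BM25 formula with commonly used defaults (k1 = 1.5, b = 0.75), and it replaces the translation and embedding-lookup steps with a hypothetical hand-written table of related terms (RELATED). The corpus, the function names, and the table are all invented for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (deliberately simplistic)."""
    return text.lower().split()


def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each tokenized document against the query with Okapi BM25."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in set(query_terms):
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def expand_query(query_terms, related_terms):
    """Append related terms from a lookup table.

    The table is a hypothetical stand-in for the translation and
    embedding-based expansion steps named in the description.
    """
    expanded = list(query_terms)
    for term in query_terms:
        for extra in related_terms.get(term, []):
            if extra not in expanded:
                expanded.append(extra)
    return expanded


# Toy corpus and toy expansion table, both invented for illustration.
DOCS = [tokenize(t) for t in [
    "the automobile market grew this year",
    "a new car model was announced",
    "football results from the weekend",
]]
RELATED = {"car": ["automobile", "vehicle"]}

query = tokenize("car")
baseline = bm25_scores(query, DOCS)
expanded = bm25_scores(expand_query(query, RELATED), DOCS)
```

In this toy setup the unexpanded query "car" scores only the second document above zero, while the expanded query also gives a non-zero score to the first document, which mentions "automobile". That is the recall effect the description attributes to expansion; a real system would derive the related terms from translation and embeddings rather than from a fixed table.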