Showing 1 - 2 of 2 for search: '"Zemour, Eliott"'
Deploying language models (LMs) necessitates outputs to be both high-quality and compliant with safety guidelines. Although Inference-Time Guardrails (ITG) offer solutions that shift model output distributions towards compliance, we find that current
External link:
http://arxiv.org/abs/2407.16318
Authors:
Sun, Albert Yu; Zemour, Eliott; Saxena, Arushi; Vaidyanathan, Udith; Lin, Eric; Lau, Christian; Mugunthan, Vaikkunth
Machine learning practitioners often fine-tune generative pre-trained models like GPT-3 to improve model performance at specific tasks. Previous works, however, suggest that fine-tuned machine learning models memorize and emit sensitive information f
External link:
http://arxiv.org/abs/2307.16382