Search results - "Thakur, Aman Singh"

Report

Judging the Judges: Evaluating Alignment and Vulnerabilities in LLMs-as-Judges

Authors: Thakur, Aman Singh; Choudhary, Kartik; Ramayapally, Venkat Srinik; Vaidyanathan, Sankaran; Hupkes, Dieuwke

Offering a promising solution to the scalability challenges associated with human evaluation, the LLM-as-a-judge paradigm is rapidly gaining traction as an approach to evaluating large language models (LLMs). However, there are still many open questi

External link: http://arxiv.org/abs/2406.12624
