Search results - "evaluation practices"

Report

Really Doing Great at Model Evaluation for CATE Estimation? A Critical Consideration of Current Model Evaluation Practices in Treatment Effect Estimation

Authors: Souto, Hugo Gobato; Neto, Francisco Louzada

This paper critically examines current methodologies for evaluating models in Conditional and Average Treatment Effect (CATE/ATE) estimation, identifying several key pitfalls in existing practices. The current approach of over-reliance on specific me

External link: http://arxiv.org/abs/2409.05161

Report

Automatic Metrics in Natural Language Generation: A Survey of Current Evaluation Practices

Authors: Schmidtová, Patrícia; Mahamood, Saad; Balloccu, Simone; Dušek, Ondřej; Gatt, Albert; Gkatzia, Dimitra; Howcroft, David M.; Plátek, Ondřej; Sivaprasad, Adarsa

Automatic metrics are extensively used to evaluate natural language processing systems. However, there has been increasing focus on how they are used and reported by practitioners within the field. In this paper, we have conducted a survey on the use

External link: http://arxiv.org/abs/2408.09169

Academic article

TOWARDS OPTIMAL PROJECT MANAGEMENT: INFLUENCE OF MONITORING AND EVALUATION PRACTICES ON PROJECT OUTCOMES IN HIV SERVICE PROVISION IN KENYA.

Authors: Jacinta, Mutie Mwikali¹ mutiemj@outlook.com; Wambugu, Lydia² lydiah.nyaguthii@uonbi.ac.ke; Nyonje, Raphael³ raphael.nyonje@uonbi.ac.ke; Kikwatha, Reuben⁴ kikwathar@uonbi.ac.ke

Published in: International Journal of Professional Business Review (JPBReview). 2024, Vol. 9, Issue 9, pp. 1-18 (18 pages).

Report

On the Evaluation Practices in Multilingual NLP: Can Machine Translation Offer an Alternative to Human Translations?

Authors: Choenni, Rochelle; Rajaee, Sara; Monz, Christof; Shutova, Ekaterina

While multilingual language models (MLMs) have been trained on 100+ languages, they are typically only evaluated across a handful of them due to a lack of available test data in most languages. This is particularly problematic when assessing MLM's po

External link: http://arxiv.org/abs/2406.14267
