Search results - "Consistency evaluation"

Report

Regional consistency evaluation and sample size calculation under two MRCTs

Authors: Qing, Kunhai, Ren, Xinru, Xu, Jin

Multi-regional clinical trial (MRCT) has been common practice for drug development and global registration. The FDA guidance "Demonstrating Substantial Evidence of Effectiveness for Human Drug and Biological Products Guidance for Industry" (FDA, 2019

External link: http://arxiv.org/abs/2411.15567

Report

AXCEL: Automated eXplainable Consistency Evaluation using LLMs

Authors: Sreekar, P Aditya, Verma, Sahil, Chopra, Suransh, Ghazarian, Sarik, Persad, Abhishek, Sadagopan, Narayanan

Large Language Models (LLMs) are widely used in both industry and academia for various tasks, yet evaluating the consistency of generated text responses continues to be a challenge. Traditional metrics like ROUGE and BLEU show a weak correlation with

External link: http://arxiv.org/abs/2409.16984

Report

Zero-shot Factual Consistency Evaluation Across Domains

Author: Agarwal, Raunak

This work addresses the challenge of factual consistency in text generation systems. We unify the tasks of Natural Language Inference, Summarization Evaluation, Factuality Verification and Factual Consistency Evaluation to train models capable of eva

External link: http://arxiv.org/abs/2408.04114

Report

Improving Network Interpretability via Explanation Consistency Evaluation

Authors: Wu, Hefeng, Jiang, Hao, Wang, Keze, Tang, Ziyi, He, Xianghuan, Lin, Liang

While deep neural networks have achieved remarkable performance, they tend to lack transparency in prediction. The pursuit of greater interpretability in neural networks often results in a degradation of their original performance. Some works strive

External link: http://arxiv.org/abs/2408.04600

Report

Face4RAG: Factual Consistency Evaluation for Retrieval Augmented Generation in Chinese

Authors: Xu, Yunqi, Cai, Tianchi, Jiang, Jiyan, Song, Xierui

Published in: KDD 2024 (oral)

The prevailing issue of factual inconsistency errors in conventional Retrieval Augmented Generation (RAG) motivates the study of Factual Consistency Evaluation (FCE). Despite the various FCE methods proposed earlier, these methods are evaluated on da

External link: http://arxiv.org/abs/2407.01080

Report

Assessing Image Inpainting via Re-Inpainting Self-Consistency Evaluation

Authors: Chen, Tianyi, Zhang, Jianfu, Hong, Yan, Zhang, Yiyi, Zhang, Liqing

Image inpainting, the task of reconstructing missing segments in corrupted images using available data, faces challenges in ensuring consistency and fidelity, especially under information-scarce conditions. Traditional evaluation methods, heavily dep

External link: http://arxiv.org/abs/2405.16263

Report

Factual Consistency Evaluation of Summarisation in the Era of Large Language Models

Authors: Luo, Zheheng, Xie, Qianqian, Ananiadou, Sophia

Factual inconsistency with source documents in automatically generated summaries can lead to misinformation or pose risks. Existing factual consistency (FC) metrics are constrained by their performance, efficiency, and explainability. Recent advances

External link: http://arxiv.org/abs/2402.13758

Report

DCR-Consistency: Divide-Conquer-Reasoning for Consistency Evaluation and Improvement of Large Language Models

Authors: Cui, Wendi, Zhang, Jiaxin, Li, Zhuohang, Damien, Lopez, Das, Kamalika, Malin, Bradley, Kumar, Sricharan

Evaluating the quality and variability of text generated by Large Language Models (LLMs) poses a significant, yet unresolved research challenge. Traditional evaluation methods, such as ROUGE and BERTScore, which measure token similarity, often fail t

External link: http://arxiv.org/abs/2401.02132
