Search results

Report

Mitigating Text Toxicity with Counterfactual Generation

Authors: Bhan, Milan; Vittaut, Jean-Noel; Achache, Nina; Legrand, Victor; Chesneau, Nicolas; Blangero, Annabelle; Murris, Juliette; Lesot, Marie-Jeanne

Toxicity mitigation consists in rephrasing text in order to remove offensive or harmful meaning. Neural natural language processing (NLP) models have been widely used to target and mitigate textual toxicity. However, existing methods fail to detoxify

External link: http://arxiv.org/abs/2405.09948

Report

Self-AMPLIFY: Improving Small Language Models with Self Post Hoc Explanations

Authors: Bhan, Milan; Vittaut, Jean-Noel; Chesneau, Nicolas; Lesot, Marie-Jeanne

Incorporating natural language rationales in the prompt and In-Context Learning (ICL) have led to a significant improvement in the performance of Large Language Models (LLMs). However, generating high-quality rationales requires human annotation or the use o

External link: http://arxiv.org/abs/2402.12038

Report

TIGTEC: Token Importance Guided TExt Counterfactuals

Authors: Bhan, Milan; Vittaut, Jean-Noel; Chesneau, Nicolas; Lesot, Marie-Jeanne

Counterfactual examples explain a prediction by highlighting changes to an instance that flip the outcome of a classifier. This paper proposes TIGTEC, an efficient and modular method for generating sparse, plausible and diverse counterfactual explanatio

External link: http://arxiv.org/abs/2304.12425

Report

Evaluating self-attention interpretability through human-grounded experimental protocol

Authors: Bhan, Milan; Achache, Nina; Legrand, Victor; Blangero, Annabelle; Chesneau, Nicolas

Attention mechanisms have played a crucial role in the development of complex architectures such as Transformers in natural language processing. However, Transformers remain hard to interpret and are considered black boxes. This paper aims to asse

External link: http://arxiv.org/abs/2303.15190
