Search results - "Pilan, Ildiko"

Report

Neural Text Sanitization with Privacy Risk Indicators: An Empirical Analysis

Authors: Papadopoulou, Anthi; Lison, Pierre; Anderson, Mark; Øvrelid, Lilja; Pilán, Ildikó

Text sanitization is the task of redacting a document to mask all occurrences of (direct or indirect) personal identifiers, with the goal of concealing the identity of the individual(s) referred to in it. In this paper, we consider a two-step approach t

External link: http://arxiv.org/abs/2310.14312

Report

Conversational Feedback in Scripted versus Spontaneous Dialogues: A Comparative Analysis

Authors: Pilán, Ildikó; Prévot, Laurent; Buschmeier, Hendrik; Lison, Pierre

Published in: Proceedings of SIGdial 2024, pp. 440-457. Kyoto, Japan (2024)

Scripted dialogues such as movie and TV subtitles constitute a widespread source of training data for conversational NLP models. However, there are notable linguistic differences between these dialogues and spontaneous interactions, especially regard

External link: http://arxiv.org/abs/2309.15656

Report

Bootstrapping Text Anonymization Models with Distant Supervision

Authors: Papadopoulou, Anthi; Lison, Pierre; Øvrelid, Lilja; Pilán, Ildikó

We propose a novel method to bootstrap text anonymization models based on distant supervision. Instead of requiring manually labeled training data, the approach relies on a knowledge graph expressing the background information assumed to be publicly

External link: http://arxiv.org/abs/2205.06895

Report

The Text Anonymization Benchmark (TAB): A Dedicated Corpus and Evaluation Framework for Text Anonymization

Authors: Pilán, Ildikó; Lison, Pierre; Øvrelid, Lilja; Papadopoulou, Anthi; Sánchez, David; Batet, Montserrat

We present a novel benchmark and associated evaluation metrics for assessing the performance of text anonymization methods. Text anonymization, defined as the task of editing a text document to prevent the disclosure of personal information, currentl

External link: http://arxiv.org/abs/2202.00443

Report

Building a Norwegian Lexical Resource for Medical Entity Recognition

Authors: Pilán, Ildikó; Brekke, Pål H.; Øvrelid, Lilja

We present a large Norwegian lexical resource of categorized medical terms. The resource merges information from large medical databases, and contains over 77,000 unique entries, including automatically mapped terms from a Norwegian medical dictionar

External link: http://arxiv.org/abs/2004.02509

Report

Candidate sentence selection for language learning exercises: from a comprehensive framework to an empirical evaluation

Authors: Pilán, Ildikó; Volodina, Elena; Borin, Lars

We present a framework and its implementation relying on Natural Language Processing methods, which aims at the identification of exercise item candidates from corpora. The hybrid system combining heuristics and machine learning methods includes a nu

External link: http://arxiv.org/abs/1706.03530

Report

Detecting Context Dependence in Exercise Item Candidates Selected from Corpora

Author: Pilán, Ildikó

We explore the factors influencing the dependence of single sentences on their larger textual context in order to automatically identify candidate sentences for language learning exercises from corpora which are presentable in isolation. An in-depth

External link: http://arxiv.org/abs/1605.01845

Report

SweLL on the rise: Swedish Learner Language corpus for European Reference Level studies

Authors: Volodina, Elena; Pilán, Ildikó; Enström, Ingegerd; Llozhi, Lorena; Lundkvist, Peter; Sundberg, Gunlög; Sandell, Monica

We present a new resource for Swedish, SweLL, a corpus of Swedish Learner essays linked to learners' performance according to the Common European Framework of Reference (CEFR). SweLL consists of three subcorpora - SpIn, SW1203 and Tisus, collected fr

External link: http://arxiv.org/abs/1604.06583

Report

A Readable Read: Automatic Assessment of Language Learning Materials based on Linguistic Complexity

Authors: Pilán, Ildikó; Vajjala, Sowmya; Volodina, Elena

Corpora and web texts can become a rich language learning resource if we have a means of assessing whether they are linguistically appropriate for learners at a given proficiency level. In this paper, we aim at addressing this issue by presenting the

External link: http://arxiv.org/abs/1603.08868
