An evaluation model for systems and resources employed in the correction of errors in textual documents
Authors: | Arnaud Renard, Béatrice Rumpler, Sylvie Calabretto |
---|---|
Contributors: | Distribution, Recherche d'Information et Mobilité (DRIM), Laboratoire d'InfoRmatique en Image et Systèmes d'information (LIRIS), Institut National des Sciences Appliquées de Lyon (INSA Lyon), Université de Lyon-Institut National des Sciences Appliquées (INSA)-Université de Lyon-Institut National des Sciences Appliquées (INSA)-Centre National de la Recherche Scientifique (CNRS)-Université Claude Bernard Lyon 1 (UCBL), Université de Lyon-École Centrale de Lyon (ECL), Université de Lyon-Université Lumière - Lyon 2 (UL2)-Institut National des Sciences Appliquées de Lyon (INSA Lyon), Université de Lyon-Université Lumière - Lyon 2 (UL2), Franck Morvan, A Min Tjoa, Roland R. Wagner |
Language: | English |
Publication year: | 2011 |
Subject: | Information retrieval; Computer science; Context (language use); Variable (computer science); Text mining; Benchmark (computing); Quality (business); The Internet; Data mining; String metric; Web service; Engineering and technology; Information systems; Electrical engineering, electronic engineering, information engineering; Artificial intelligence & image processing; [INFO] Computer Science [cs] |
Source: | 8th International Workshop on Text-based Information Retrieval (TIR 2011), in conjunction with the 22nd International Conference on Database and Expert Systems Applications (DEXA 2011), Aug 2011, Toulouse, France, pp. 160-164, ⟨10.1109/DEXA.2011.11⟩; DEXA Workshops |
DOI: | 10.1109/DEXA.2011.11 |
Description: | International audience; The wide adoption of Web 2.0 services has increased the amount of information produced, and the quantity of errors contained in that information has grown even faster. In the traditional information production process, documents were produced by professionals, whereas on the Web the content is generated by the users themselves. Systems that manage information of variable quality therefore need to take these errors into account. Our review of the state of the art identifies difficulties in the comparative evaluation of error correction systems. We propose an evaluation model for error correction systems and for the low-level string similarity (and distance) metrics they rely on. This model is implemented in an extensible platform that provides a framework for evaluating those systems. |
Database: | OpenAIRE |