A collaborative approach to building evaluated web pages datasets

Authors: Heraldo J. A. Carneiro Filho, Geraldo Xexéo, Ricardo Barros, Oliverio C. Fernandes, Andre L. G. Ribeiro, Fabricio R. S. Ferreira, Carlos Eduardo Paulino Silva, Jose A. Rodrigues Nt, Jano Moreira de Souza
Year of publication: 2009
Subject:
Source: CSCWD
DOI: 10.1109/cscwd.2009.4968135
Description: In order to evaluate information retrieval algorithms, it is imperative to use a dataset as a test database. However, access to such datasets is often difficult and expensive, since building them is a time-consuming and costly task. This paper presents a collaborative approach to dataset creation that uses a data quality evaluation technique based on fuzzy theory to assist users in selecting suitable web documents for their datasets. These documents are automatically captured by a crawler and assessed using information derived from their metadata.
Database: OpenAIRE