Public Data Integration with WebSmatch

Authors: Coletta, R., Castanier, E., Valduriez, P., Frisch, C., Ngo, D., Bellahsene, Z.
Publication year: 2012
Subject:
Document type: Working Paper
Description: Integrating open data sources can yield high-value information but raises major problems in terms of metadata extraction, data source integration and visualization of integrated data. In this paper, we describe WebSmatch, a flexible environment for Web data integration, based on a real, end-to-end data integration scenario over public data from Data Publica. WebSmatch supports the full process of importing, refining and integrating data sources and uses third-party tools for high-quality visualization. We use a typical scenario of public data integration which involves problems not solved by current tools: poorly structured input data sources (XLS files) and rich visualization of integrated data.
Comment: Presented at the First International Workshop On Open Data, WOD-2012 (http://arxiv.org/abs/1204.3726)
Database: arXiv