Integrating Disparate Digital Libraries using the WASSIT Mediation Framework

Authors: Faouzia Wadjinny, Ahmed Moujane, Imane Zaoui, Dalila Chiadmi
Publication year: 2011
Subject:
Source: Digital Libraries-Methods and Applications
DOI: 10.5772/15823
Description: There is a growing trend toward integrating several digital libraries (DLs) to offer richer information. However, three characteristics of DLs make their integration difficult (Hasselbring, 2000): (i) Distribution: DLs are geographically spread; (ii) Heterogeneity: DLs differ at both the technical level (e.g., hardware platform, operating system) and the conceptual level (e.g., data model, query language); (iii) Autonomy: DLs are self-sufficient rather than delegated a role as components of a larger system. The challenges of integrating DLs therefore include interoperability (among different DLs) and resource discovery (selecting the best sites to integrate). Two types of interoperability are relevant to DL integration (Shen, 2006): syntactic and semantic. Syntactic interoperability is application-level interoperability, which allows multiple software components to cooperate even though their data models, query languages, interfaces, etc. differ. Semantic interoperability is knowledge-level interoperability, which allows digital libraries to be integrated by bridging semantic conflicts that arise from differences in implicit meanings, perspectives, and assumptions, thus creating a semantically compatible information environment based on agreed-upon concepts. Two solutions address the interoperability problem: warehousing and mediation systems. In the warehouse approach (Rundensteiner et al., 2000), information is periodically extracted from the different sources, processed, merged with information from other sources, and then loaded into a centralized data store. Queries are posed against this local data without further interaction with the original sources. Modifications at the sources are filtered (e.g., for relevance or update time) and propagated to update the data warehouse.
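The warehouse approach described above can be illustrated with a minimal sketch. It is not taken from the WASSIT framework: the `Warehouse` class, the `library_a` and `library_b` sources, and the title-based merge key are all hypothetical, chosen only to show the extract, merge, and load steps followed by a query that touches local data only.

```python
from dataclasses import dataclass, field


@dataclass
class Warehouse:
    """Hypothetical centralized store; queries never contact the sources."""
    records: dict = field(default_factory=dict)

    def refresh(self, sources):
        """Periodic extract-merge-load pass over every source."""
        for source_name, fetch in sources.items():
            for record in fetch():                       # extract
                title = record["title"].strip()          # process
                key = title.lower()                      # naive merge key (assumption)
                merged = self.records.setdefault(
                    key, {"title": title, "held_by": set()}
                )
                merged["held_by"].add(source_name)       # merge and load

    def query(self, keyword):
        """Answer from the local copy only, so results may be stale."""
        keyword = keyword.lower()
        return sorted(
            r["title"] for r in self.records.values()
            if keyword in r["title"].lower()
        )


# Two hypothetical digital libraries exposing their holdings.
def library_a():
    return [{"title": "Data Integration"}, {"title": "Query Mediation"}]


def library_b():
    return [{"title": "data integration "}, {"title": "Digital Archives"}]


warehouse = Warehouse()
warehouse.refresh({"A": library_a, "B": library_b})
print(warehouse.query("data"))
```

In this sketch, a record added to either library after `refresh` stays invisible to `query` until the next refresh, which is the freshness trade-off that comes with answering from a local copy.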
The main advantage of the warehousing approach is query-processing performance. Its main drawbacks are that the data may not be fresh and that adding a new data source requires reconsidering the warehouse schema. Thus, concerns about data quality and consistency must be addressed. In mediation systems (Wiederhold, 1992), data remains at the sources, and queries to the integrated system must be translated, at run time, into a sequence of sub-queries to the underlying data sources. Data is not replicated and is guaranteed to be fresh at query time. However, a considerable performance penalty must be paid because the sources are contacted for every query. Besides, in heterogeneous environments, especially in the context of DLs
Database: OpenAIRE