Introducing shadows: Flexible document representation and annotation on the Web
Authors: | Matheus Silva Mota, Claudia Bauzer Medeiros |
---|---|
Publication year: | 2013 |
Subject: | Structure (mathematical logic); Information retrieval; Computer science; Interoperability; Well-formed document; Document management system; Document clustering; World Wide Web; Annotation; ComputingMethodologies_DOCUMENTANDTEXTPROCESSING; Document engineering; Content management |
Source: | ICDE Workshops |
DOI: | 10.1109/icdew.2013.6547416 |
Description: | The Web is witnessing exponential growth of increasingly complex, distributed and heterogeneous documents. This hampers the exchange, annotation and retrieval of documents. While information retrieval mechanisms concentrate on textual features (corpus analysis), annotation approaches either target specific formats or require that a document follow interoperable standards. This work presents our effort to handle these problems by providing a more flexible solution. Rather than trying to modify or convert the document itself, or to target only textual characteristics, the strategy described in this work is based on an intermediate descriptor: the document shadow. A shadow represents domain-relevant aspects and elements of both the structure and the content of a given document, as defined by a user group. It is the shadows that are annotated, not the documents themselves, which makes annotations independent of document formats. Our annotations take advantage of the Linked Open Data (LOD) initiative. Via annotations, users can derive correlations across shadows in a flexible way. Moreover, shadows and annotations are stored in databases, thereby allowing uniform database treatment of heterogeneous documents. |
Database: | OpenAIRE |