Unstable markup: A template-based information extraction from web sites with unstable markup

Authors: Kolchin, Maxim; Kozlov, Fedor
Year of publication: 2014
Subject:
Document type: Working Paper
Description: This paper presents the results of work on crawling the CEUR Workshop Proceedings web site into a Linked Open Data (LOD) dataset, carried out within the framework of the ESWC 2014 Semantic Publishing Challenge. Our approach is based on an extensible template-dependent crawler and uses DBpedia to link extracted entities, such as the names of universities and countries.
Comment: ESWC 2014 Semantic Publishing Challenge, Task 1
Database: arXiv