Automatic Workflow Base on Web Knowledge Extraction - A Case Study on Recipe Website

Author: Chen, Chunghung, 陳昶宏
Year of publication: 2011
Document type: thesis (學位論文)
Description: 99
This thesis focuses on the recipe application domain. The proposed system collects recipe-related pages from websites, extracts domain metadata from those pages, mines relations among recipe data objects, and integrates this information into knowledge for the recipe domain. First, by searching the Web and analyzing pages and sites, the system accumulates a large collection of recipe data. Next, the major content blocks of recipe pages are extracted by analyzing the HTML DOM tree structures of pages and sites. The texts extracted from those pages are indexed, and phrase extraction and paring methods are then used to identify significant recipe keywords (key phrases) as domain concepts. Association mining between recipe-domain concepts is applied to explore concept relations and build the domain ontology. Recipe concepts and relations are presented as a Knowledge Map rendered in SVG (Scalable Vector Graphics). Finally, as a demonstration, the recipe domain knowledge is used to provide services for mobile applications.
Database: Networked Digital Library of Theses & Dissertations
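
Illustrative note: the abstract mentions mining associations between recipe-domain concepts but does not say which algorithm the thesis uses. The following is a minimal sketch of one common approach, pairwise co-occurrence rules filtered by support and confidence. The function name `mine_associations`, the thresholds, and the sample recipes are all hypothetical and are not taken from the thesis.

```python
from collections import Counter
from itertools import combinations


def mine_associations(docs, min_support=0.5, min_confidence=0.6):
    """Return (lhs, rhs, support, confidence) rules between concept pairs.

    docs is a list of concept lists, one per recipe page (hypothetical input).
    support    = fraction of pages containing both concepts
    confidence = pages containing both / pages containing lhs
    """
    n = len(docs)
    item_counts = Counter()
    pair_counts = Counter()
    for concepts in docs:
        # Deduplicate and sort so each pair is counted once per page
        # and always appears in the same order.
        unique = sorted(set(concepts))
        item_counts.update(unique)
        pair_counts.update(combinations(unique, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Each frequent pair can yield a rule in either direction.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


# Hypothetical sample data: concepts already extracted from four pages.
recipes = [
    ["tomato", "basil", "pasta"],
    ["tomato", "basil", "mozzarella"],
    ["tomato", "pasta", "garlic"],
    ["chicken", "garlic"],
]
rules = mine_associations(recipes)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

Rules that pass both thresholds could serve as candidate edges between concepts in a knowledge map, with confidence as an edge weight; whether the thesis builds its ontology this way is not stated in the abstract.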