THE DATA EXTRACTION USING DISTRIBUTED CRAWLER INSIDE THE MULTI-AGENT SYSTEM.

Authors: TOMALA, Karel, PLUCAR, Jan, DUBEC, Patrik, RAPANT, Lukas, VOZNAK, Miroslav
Subject:
Source: Advances in Electrical & Electronic Engineering; Dec2013, Vol. 11 Issue 6, p455-460, 6p
Abstract: The paper discusses the use of web crawler technology. We built an application based on a standard web crawler and intended for data extraction. It was designed primarily to extract data from the social network Twitter using keywords. First, we created a standard crawler that went through a predefined list of URLs and downloaded the page content of each URL in turn. The page content was then parsed, and the important text and metadata were stored in a database. Recently, the application was converted into a multi-agent system. The system was developed in C#, a language that is also used to create web applications and sites. The obtained data were evaluated graphically. The system was created within the INDECT project at the VSB-Technical University of Ostrava. [ABSTRACT FROM AUTHOR]
Database: Complementary Index
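
Note: The abstract describes the first, single-process stage of the crawler as a loop over a predefined URL list that downloads each page, parses it, and stores the important text and metadata in a database. A minimal sketch of that loop, under stated assumptions, might look like the following. It is written in Python for brevity, whereas the authors' system was developed in C#. The names `PageParser`, `fetch`, `crawl`, the `pages` table layout, and the substring-based keyword match are all hypothetical illustrations and are not taken from the paper. The multi-agent version is not sketched here.

```python
import json
import sqlite3
from html.parser import HTMLParser
from urllib.request import urlopen


class PageParser(HTMLParser):
    """Collects visible text and <meta name=... content=...> pairs (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.text_parts = []
        self.metadata = {}
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "meta":
            attr = dict(attrs)
            if attr.get("name") and attr.get("content"):
                self.metadata[attr["name"]] = attr["content"]

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is not script or style content.
        if self._skip_depth == 0 and data.strip():
            self.text_parts.append(data.strip())


def fetch(url):
    """Download one page over HTTP; assumes network access and UTF-8 content."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(urls, keywords, conn, fetch_page=fetch):
    """Visit each URL in order and store pages whose text mentions any keyword.

    Returns the number of pages stored. The table schema is an assumption.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pages "
        "(url TEXT PRIMARY KEY, text TEXT, metadata TEXT)"
    )
    stored = 0
    for url in urls:
        parser = PageParser()
        parser.feed(fetch_page(url))
        parser.close()
        text = " ".join(parser.text_parts)
        lowered = text.lower()
        # Simple case-insensitive substring match stands in for keyword extraction.
        if any(keyword.lower() in lowered for keyword in keywords):
            conn.execute(
                "INSERT OR REPLACE INTO pages VALUES (?, ?, ?)",
                (url, text, json.dumps(parser.metadata)),
            )
            stored += 1
    conn.commit()
    return stored
```

Passing the download function in as `fetch_page` keeps the loop usable without network access, for example by supplying a dictionary lookup over saved pages. A call such as `crawl(url_list, ["twitter"], sqlite3.connect("crawl.db"))` would use the default HTTP download.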