Authors: |
June-Young Jung, Moon-Soo Chang |
Year of publication: |
2007 |
Source: |
Journal of Korean Institute of Intelligent Systems. 17:849-854 |
ISSN: |
1976-9172 |
DOI: |
10.5391/jkiis.2007.17.6.849 |
Description: |
It is difficult to collect only the target documents from the innumerable documents on the Web. One solution is to select target documents from a Web site that serves many documents in the target domain. In this paper, we propose an intelligent crawling method that collects the needed documents based on a URL pattern script defined in XML. The proposed method applies efficiently to sites that serve uniformly structured information from a database. Using this method, we collected 50 thousand Web documents. |
Database: |
OpenAIRE |
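
The abstract mentions a URL pattern script defined in XML but does not show its format. The following is a minimal Python sketch of how such a script might drive a crawler's decisions. The element names (`site`, `pattern`), the `base` and `role` attributes, the role values `follow` and `collect`, and the example URLs are all assumptions made for illustration and are not taken from the paper.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical pattern script: the schema below is an assumption,
# not the format used by the authors.
PATTERN_SCRIPT = r"""
<site base="http://example.com">
  <pattern role="follow">/list\?page=\d+</pattern>
  <pattern role="collect">/doc/\d+\.html</pattern>
</site>
"""

def load_patterns(xml_text):
    """Parse the script into a list of (role, compiled regex) rules."""
    root = ET.fromstring(xml_text.strip())
    base = root.get("base")
    rules = []
    for node in root.findall("pattern"):
        # The base URL is matched literally; the pattern text is a regex.
        regex = re.compile(re.escape(base) + node.text.strip())
        rules.append((node.get("role"), regex))
    return rules

def classify(url, rules):
    """Return the role of the first rule matching the whole URL, else None."""
    for role, regex in rules:
        if regex.fullmatch(url):
            return role
    return None

rules = load_patterns(PATTERN_SCRIPT)
print(classify("http://example.com/doc/42.html", rules))
print(classify("http://example.com/list?page=3", rules))
print(classify("http://example.com/about", rules))
```

In this sketch a crawler would save pages whose URLs are classified as `collect`, queue pages classified as `follow` for link extraction, and skip everything else, which is one plausible way to restrict a crawl to database-generated pages with a regular URL structure.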