Query Interface Schema Extraction for Hidden Web Resources Searching

Authors: Zhang Huan, Yang Panfei, Yu Zitong
Publication year: 2020
Subject:
Source: 2020 7th International Conference on Information Science and Control Engineering (ICISCE).
Description: Giving people effective access to high-quality Web content is an urgent task in Web search. Unlike static pages, which are retrieved by sending an HTTP request to a specified URL, hidden Web resources (the deep Web) are accessed by posting queries to the query interface provided by the website. The query interface is the entry point to the Web database. Schema extraction from deep Web query interfaces is therefore a key step in mining hidden Web resources. This paper presents a novel approach, based on domain ontology, to extracting interface schemas from the deep Web. It also proposes a new representation of query interface attributes that reflects the semantic relationships between labels, based on location, label semantic relationships, and string similarity. Experimental results show that our system is feasible and efficient, and achieves high precision, recall, and F-measure across a variety of databases.
Database: OpenAIRE