Clinic expert information extraction based on domain model and block importance model
Authors: | Danmin Qian, Xingyun Geng, Dengfu Yao, Jiancheng Dong, Yuanpeng Zhang, Li Wang |
---|---|
Publication year: | 2015 |
Subject: | Internet, Support Vector Machine, Web search query, Information retrieval, Computer science, Information Storage and Retrieval, Reproducibility of Results, Health Informatics, Domain model, Hospitals, Computer Science Applications, Domain (software engineering), Information extraction, Web query classification, Web page, Data Mining, Computer Simulation, Algorithms, Medical Informatics, Block (data storage), XPath |
Source: | Computers in Biology and Medicine. 66:337-342 |
ISSN: | 0010-4825 |
DOI: | 10.1016/j.compbiomed.2015.07.009 |
Description: | Extracting clinic expert information from the Deep Web poses two challenges. The first is judging which domain a query form belongs to. To address it, a novel method is proposed based on a domain model, a tree structure constructed from the attributes of query interfaces. With this model, query interfaces can be classified into a domain and filled in with domain keywords. The second challenge is extracting information from the response Web pages indexed by query interfaces. To filter noisy information on a Web page, a block importance model is proposed that takes both content and spatial features into account. The experimental results indicate that the domain model yields a precision 4.89% higher than that of the rule-based method, whereas the block importance model yields an F1 measure 10.5% higher than that of the XPath method. Clinic expert information provides references for residents who need hospital care. A domain model was defined to identify Web Query Interfaces. A virtual cluster was established for improving the performance of the domain model. A block importance model was proposed to filter noisy information in a Web page. |
Database: | OpenAIRE |
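
The description mentions classifying query interfaces against a domain model built as a tree of interface attributes. The record does not give the paper's actual algorithm, so the following is only a minimal sketch of one way such a classification could work, assuming a simple label-overlap score. The `AttributeNode` class, the `classify_interface` function, the 0.5 threshold, and the example domains are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class AttributeNode:
    """One node of a hypothetical domain tree: an attribute, its synonyms, and child attributes."""
    name: str
    synonyms: set[str] = field(default_factory=set)
    children: list["AttributeNode"] = field(default_factory=list)

    def vocabulary(self) -> set[str]:
        """Collect the lower-cased labels of this node and all of its descendants."""
        words = {self.name.lower()} | {s.lower() for s in self.synonyms}
        for child in self.children:
            words |= child.vocabulary()
        return words


def classify_interface(labels, domains, threshold=0.5):
    """Return the name of the best-matching domain root, or None.

    The score is the share of form labels found in a domain's vocabulary
    (an assumed heuristic, not the paper's method).
    """
    normalized = {label.strip().lower() for label in labels if label.strip()}
    if not normalized:
        return None
    best_name, best_score = None, 0.0
    for root in domains:
        score = len(normalized & root.vocabulary()) / len(normalized)
        if score > best_score:
            best_name, best_score = root.name, score
    return best_name if best_score >= threshold else None


# Hypothetical domain trees used only for illustration.
clinic = AttributeNode(
    "clinic expert",
    children=[
        AttributeNode("doctor", {"physician", "expert name"}),
        AttributeNode("department", {"specialty"}),
        AttributeNode("hospital"),
    ],
)
flights = AttributeNode(
    "flight",
    children=[AttributeNode("departure"), AttributeNode("destination")],
)

print(classify_interface(["Physician", "Department", "Keyword"], [clinic, flights]))
print(classify_interface(["Username", "Password"], [clinic, flights]))
```

In this sketch, a form whose labels mostly match the clinic tree is assigned to that domain, while a login form matches neither tree and is rejected. A real system would likely need fuzzier label matching than exact string comparison.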