Description: |
When making clinical decisions, physicians often seek scientific evidence on how best to care for their patients. This evidence is available in the form of published medical articles, reports, and clinical trials. Given the volume of existing medical literature and the pace at which medical research is growing, finding the most relevant information is a tedious task. In this paper, we describe an empirical approach to retrieving relevant medical articles from the PubMed collection (about 733,328 articles, 45.2 gigabytes) for a given query. Our IR system comprises three parts: inverted indexing with Lucene, lexical query expansion with MetaMap to increase recall, and re-ranking aimed at optimizing the system. Word senses of ambiguous terms are introduced to limit the negative effects that synonymy-based query expansion may have on precision. The resulting ranked list is then re-ranked with learning-to-rank algorithms. We evaluated our system on 30 medical queries, and the results show that it handles various medical queries effectively and efficiently. The final results also show that the ensemble approach outperforms the Lucene baseline by boosting the ranking of articles that appear near the top of several individual ranked lists. |