Popis: |
Information overload is a well-known problem facing biomedical professionals. MEDLINE, the biomedical bibliographic database, adds hundreds of articles daily to the millions already in its collection. This overload is exacerbated by the lack of relevance-based ranking for search results, as well as disparate levels of search skill and domain experience of professionals using systems designed to search MEDLINE. We propose to address these problems through learning ranking functions from user relevance feedback. Simple active learning techniques can be used to learn ranking functions using a fraction of the available data, with performance approaching that of functions learned using all available data. Furthermore, ranking functions learned using metadata features from the Medical Subject Heading (MeSH) terms associated with MEDLINE citations greatly outperform functions learned using textual features. An in-depth investigation is made into the effect of a number of variables in the ranking round, while further investigation is made into peripheral issues such as users providing inconsistent data. |