Finding relevant biomedical datasets: the UC San Diego solution for the bioCADDIE Retrieval Challenge
Authors: | Zhanglong Ji, Lucila Ohno-Machado, Yupeng He, Kai Zhang, Qi Li, Wei Wei, Yuanchi Ha |
---|---|
Year of publication: | 2018 |
Subject: | 0301 basic medicine; Metadata; Biomedical Research; Information retrieval; Databases, Factual; business.industry; Computer science; 05 social sciences; California; General Biochemistry, Genetics and Molecular Biology; 03 medical and health sciences; 030104 developmental biology; Text mining; Data Mining; Original Article; 0509 other social sciences; 050904 information & library sciences; General Agricultural and Biological Sciences; business; Information Systems |
Source: | Database: The Journal of Biological Databases and Curation |
ISSN: | 1758-0463 |
Description: | The number and diversity of biomedical datasets grew rapidly in the last decade. Many datasets are stored in various repositories, in different formats. Existing dataset retrieval systems lack the capability of cross-repository search. As a result, users spend time searching for datasets in known repositories and typically do not find new repositories. The biomedical and healthcare data discovery index ecosystem (bioCADDIE) team organized a challenge to solicit new indexing and searching strategies for retrieving biomedical datasets across repositories. We describe the work of one team that built a retrieval pipeline and examined its performance. The pipeline used online resources to supplement dataset metadata, automatically generated queries from users’ free-text questions, produced high-quality retrieval results and achieved the highest inferred Normalized Discounted Cumulative Gain among competitors. The results showed that it is a promising solution for cross-database, cross-domain and cross-repository biomedical dataset retrieval. Database URL: https://github.com/w2wei/dataset_retrieval_pipeline |
Database: | OpenAIRE |