Leaving No Stone Unturned: Using Machine Learning Based Approaches for Information Extraction from Full Texts of a Research Data Warehouse
Author: | Johanna Fiebeck, Svetlana Gerbel, Hans Laser, Hinrich B. Winther |
---|---|
Publication year: | 2018 |
Subject: |
020205 medical informatics
Computer science; business.industry; Medical findings; Data management; Unstructured data; 02 engineering and technology; computer.software_genre; Machine learning; Data warehouse; 030218 nuclear medicine & medical imaging; Metadata; 03 medical and health sciences; Information extraction; 0302 clinical medicine; 0202 electrical engineering, electronic engineering, information engineering; Artificial intelligence; business; computer; Classifier (UML); Data integration |
Source: | Lecture Notes in Computer Science ISBN: 9783030060152 DILS |
DOI: | 10.1007/978-3-030-06016-9_5 |
Description: | Data in healthcare and routine medical treatment are growing fast. Because of this growth and the variety of the data, the possible correlations within them are becoming ever more complex. Popular tools that ease the daily routine of clinical researchers are increasingly based on machine learning (ML) algorithms. Such tools can support data management, data integration, or even content classification. Besides commercial products, there are many solutions that users develop themselves for their own specific research question or task. One such task is described in this work: qualifying the Weber fracture, an ankle joint fracture, from radiological findings with the help of supervised machine learning algorithms. To do so, the findings were first processed with common natural language processing (NLP) methods. For classification, we used the bag-of-words approach to combine the medical findings with the metadata of the findings, and compared several common classifiers to obtain the best results. To conduct this study, we used the data and technology of the Enterprise Clinical Research Data Warehouse (ECRDW) of Hannover Medical School. This paper shows how machine learning and NLP techniques can be implemented in the data warehouse integration process in order to provide consolidated, processed, and qualified data that can be queried for teaching and research purposes. |
Database: | OpenAIRE |
External link: |
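
The bag-of-words step mentioned in the description, which combines the text of a finding with its metadata before supervised classification, might look roughly like the sketch below. This is not the authors' implementation: the function names (`tokenize`, `featurize`), the `NaiveBayes` class, the `modality` metadata field, and the six English toy findings are all hypothetical and were invented purely for illustration. The record does not say which classifiers were compared; multinomial naive Bayes is used here only because it is a common baseline.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Crude NLP preprocessing: lowercase, keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def featurize(finding, metadata):
    """Bag of words over the finding text, plus metadata as key=value tokens."""
    bag = Counter(tokenize(finding))
    for key, value in metadata.items():
        bag[f"{key}={value}"] += 1
    return bag


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, bags, labels):
        self.class_counts = Counter(labels)
        self.token_counts = defaultdict(Counter)
        for bag, label in zip(bags, labels):
            self.token_counts[label].update(bag)
        self.vocab = {t for c in self.token_counts.values() for t in c}
        return self

    def predict(self, bag):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            counts = self.token_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n / total)  # log prior
            for tok, freq in bag.items():
                if tok in self.vocab:  # unseen tokens are ignored
                    score += freq * math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy findings (invented, NOT data from the ECRDW).
# Labels follow the Weber classification: A below, B at, C above the syndesmosis.
TRAIN = [
    ("fracture of the fibula below the syndesmosis", {"modality": "xray"}, "A"),
    ("transverse fibula fracture distal to the syndesmosis", {"modality": "xray"}, "A"),
    ("oblique fibula fracture at the level of the syndesmosis", {"modality": "xray"}, "B"),
    ("spiral fracture at the syndesmosis level", {"modality": "ct"}, "B"),
    ("fibula fracture above the syndesmosis with rupture", {"modality": "ct"}, "C"),
    ("high fibula fracture proximal to the syndesmosis", {"modality": "xray"}, "C"),
]

model = NaiveBayes().fit(
    [featurize(text, meta) for text, meta, _ in TRAIN],
    [label for _, _, label in TRAIN],
)

print(model.predict(featurize("fibula fracture above the syndesmosis", {"modality": "ct"})))
```

Encoding metadata as `key=value` tokens keeps text and metadata in a single feature space, so any bag-of-words classifier can be swapped in for comparison without changing the featurization.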