Modelling of Cancer Patient Records: A Structured Approach to Data Mining and Visual Analytics
Authors: | Alan Hales, Jing Lu, David A. Rew |
---|---|
Year of publication: | 2017 |
Subject: | Decision support system; Visual analytics; Medical informatics; Restructuring; Computer science; Engineering and technology; Data science; Health informatics; Data warehouse; Software; Analytics; Electrical engineering, electronic engineering, information engineering; Artificial intelligence & image processing; Data mining |
Source: | Information Technology in Bio- and Medical Informatics (ITBAM), ISBN: 9783319642642 |
DOI: | 10.1007/978-3-319-64265-9_4 |
Description: | This research presents a methodology for health data analytics through a case study in modelling cancer patient records. Timeline-structured clinical data systems represent a new approach to understanding the relationship between clinical activity, disease pathologies and health outcomes. The novel Southampton Breast Cancer Data System contains episode- and timeline-structured records on more than 17,000 patients who have been treated at University Hospital Southampton and affiliated hospitals since the late 1970s. The system is under continuous development and validation. Modern data mining software and visual analytics tools permit new insights into temporally structured clinical data. The challenges and outcomes of applying such software-based systems to this complex data environment are reported here. The core data were anonymised and put through a series of pre-processing exercises to identify and exclude anomalous and erroneous data, before being restructured within a remote data warehouse. A range of approaches was tested on the resulting dataset, including multi-dimensional modelling, sequential pattern mining and classification. Visual analytics software enabled the comparison of survival times and surgical treatments. The systems tested proved powerful in identifying episode sequencing patterns that were consistent with real-world clinical outcomes. It is concluded that, subject to further refinement and selection, modern data mining techniques can be applied to large and heterogeneous clinical datasets to inform decision making. |
Database: | OpenAIRE |