The Effective Use of a Summary Table and Decision Tree Methodology to Analyze Very Large Healthcare Datasets

Authors: David Sibbritt, Robert W. Gibberd
Publication year: 2004
Subject:
Source: Health Care Management Science. 7:163-171
ISSN: 1386-9620
DOI: 10.1023/b:hcms.0000039379.32963.9e
Description: Very large datasets typically consist of millions of records with many variables. Organizations store and maintain such datasets because of the information they are perceived to contain. However, traditional data mining methods are often unable to retrieve this information because the software may be overwhelmed by the memory or computing requirements. In this article we outline a method that can analyze very large datasets. The method first performs a data reduction step by building a summary table, which then serves as a reference dataset that is recursively partitioned to grow a decision tree.
Database: OpenAIRE
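
Illustration: the two steps named in the description (reduce the data to a summary table, then recursively partition that table to grow a decision tree) can be sketched as below. This is a minimal sketch under stated assumptions, not the authors' implementation: the record does not say which splitting criterion the article uses, so entropy-based information gain is assumed here, and the function names, field names, and toy records are all hypothetical.

```python
from collections import Counter
from math import log2


def build_summary_table(records, predictors, outcome):
    """Collapse raw records into counts per unique (predictor values, outcome) cell."""
    table = Counter()
    for rec in records:
        key = tuple(rec[p] for p in predictors)
        table[(key, rec[outcome])] += 1
    return table


def class_counts(table):
    """Total count of each outcome value across the cells of a (sub)table."""
    counts = Counter()
    for (_, y), n in table.items():
        counts[y] += n
    return counts


def entropy(counts):
    total = sum(counts.values())
    return -sum((n / total) * log2(n / total) for n in counts.values() if n)


def grow_tree(table, predictors, available=None, min_gain=1e-9):
    """Recursively partition the summary table (assumed criterion: information gain)."""
    if available is None:
        available = list(range(len(predictors)))
    counts = class_counts(table)
    total = sum(counts.values())
    base = entropy(counts)
    best = None
    for i in available:
        # Group summary cells by the value of predictor i, carrying their counts.
        parts = {}
        for (key, y), n in table.items():
            parts.setdefault(key[i], Counter())[(key, y)] += n
        remainder = sum(
            sum(p.values()) / total * entropy(class_counts(p)) for p in parts.values()
        )
        gain = base - remainder
        if gain > min_gain and (best is None or gain > best[0]):
            best = (gain, i, parts)
    if best is None:
        # No useful split left: return a leaf with the majority outcome.
        return {"leaf": counts.most_common(1)[0][0], "n": total}
    _, i, parts = best
    rest = [j for j in available if j != i]
    return {
        "split": predictors[i],
        "children": {v: grow_tree(p, predictors, rest, min_gain) for v, p in parts.items()},
    }


# Hypothetical toy data: 100 raw records that reduce to 5 summary cells.
records = (
    [{"age": "old", "sex": "F", "readmit": "yes"}] * 30
    + [{"age": "old", "sex": "M", "readmit": "yes"}] * 25
    + [{"age": "old", "sex": "M", "readmit": "no"}] * 5
    + [{"age": "young", "sex": "F", "readmit": "no"}] * 20
    + [{"age": "young", "sex": "M", "readmit": "no"}] * 20
)
predictors = ["age", "sex"]
summary = build_summary_table(records, predictors, "readmit")
tree = grow_tree(summary, predictors)
```

In this sketch the tree is grown from the summary cells and their counts alone, so the raw records are scanned only once, when the table is built. The cost of each split then depends on the number of distinct cells rather than on the number of records.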