The Effective Use of a Summary Table and Decision Tree Methodology to Analyze Very Large Healthcare Datasets
Author: | David Sibbritt, Robert W. Gibberd |
---|---|
Publication year: | 2004 |
Subject: |
Incremental decision tree; Computer science; Decision trees; Medicine (miscellaneous); Effect Modifier, Epidemiologic; Health informatics; Software; Research Design; Data Interpretation, Statistical; General Health Professions; Health care; Table (database); Health Services Research; Data mining; Reference dataset; Data reduction |
Source: | Health Care Management Science. 7:163-171 |
ISSN: | 1386-9620 |
DOI: | 10.1023/b:hcms.0000039379.32963.9e |
Description: | Very large datasets typically consist of millions of records, with many variables. Such datasets are stored and maintained by organizations because of the perceived potential information they contain. However, the problem with very large datasets is that traditional methods of data mining are often not capable of retrieving this information, because the software may be overwhelmed by the memory or computing requirements. In this article we outline a method that can analyze very large datasets. The method initially performs a data reduction step through the use of a summary table, which is then used as a reference dataset that is recursively partitioned to grow a decision tree. |
Database: | OpenAIRE |
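
The two-step idea in the description (reduce the raw records to a summary table, then recursively partition that table to grow a tree) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the splitting criterion, so count-weighted Gini impurity is assumed here, and the field names (`age`, `sex`, `readmit`), function names, and toy records are all hypothetical.

```python
from collections import Counter

def build_summary_table(records, predictors, outcome):
    """Data reduction: count records per (predictor values, outcome) cell."""
    table = Counter()
    for rec in records:
        key = tuple(rec[p] for p in predictors)
        table[(key, rec[outcome])] += 1
    return table

def table_rows(table):
    """Flatten the summary table into (key, outcome, count) rows."""
    return [(key, y, n) for (key, y), n in table.items()]

def outcome_counts(rows):
    counts = Counter()
    for _, y, n in rows:
        counts[y] += n
    return counts

def gini(rows):
    """Gini impurity weighted by cell counts (assumed criterion)."""
    counts = outcome_counts(rows)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return 1.0 - sum((n / total) ** 2 for n in counts.values())

def grow_tree(rows, predictors, depth=0, max_depth=3):
    """Recursively partition summary rows; raw records are never revisited."""
    counts = outcome_counts(rows)
    total = sum(counts.values())
    node = {"n": total, "prediction": counts.most_common(1)[0][0]}
    parent = gini(rows)
    if depth >= max_depth or parent == 0.0:
        return node
    best = None
    for i, name in enumerate(predictors):
        for value in sorted({key[i] for key, _, _ in rows}):
            left = [r for r in rows if r[0][i] == value]
            right = [r for r in rows if r[0][i] != value]
            if not left or not right:
                continue
            n_left = sum(n for _, _, n in left)
            n_right = total - n_left
            score = (n_left * gini(left) + n_right * gini(right)) / total
            if best is None or score < best[0]:
                best = (score, name, value, left, right)
    if best is None or best[0] >= parent:
        return node  # no split reduces impurity
    _, name, value, left, right = best
    node["split"] = (name, value)  # left branch: predictor == value
    node["left"] = grow_tree(left, predictors, depth + 1, max_depth)
    node["right"] = grow_tree(right, predictors, depth + 1, max_depth)
    return node

# Hypothetical toy records standing in for a very large dataset.
records = (
    [{"age": "old", "sex": "F", "readmit": "yes"}] * 3
    + [{"age": "old", "sex": "M", "readmit": "yes"}] * 2
    + [{"age": "young", "sex": "F", "readmit": "no"}] * 4
    + [{"age": "young", "sex": "M", "readmit": "no"}] * 2
)
predictors = ["age", "sex"]
table = build_summary_table(records, predictors, "readmit")
tree = grow_tree(table_rows(table), predictors)
```

In this sketch the tree is grown from the table's cells rather than from the individual records, so the cost of each split search depends on the number of distinct predictor combinations, not on the number of raw records. That is the property the summary-table step is meant to provide, under the assumption that the predictors are categorical with a modest number of levels.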