Applying random forest in a health administrative data context: a conceptual guide
Authors: | Caroline King, Erin Strumpf |
---|---|
Year of publication: | 2021 |
Subject: |
0303 health sciences; Data context and interaction; medicine.medical_specialty; Computer science; Health Policy; Public health; Public Health, Environmental and Occupational Health; Health services research; 01 natural sciences; Data science; Toolbox; 3. Good health; Random forest; Health administration; 010104 statistics & probability; 03 medical and health sciences; Community health; medicine; 0101 mathematics; Strengths and weaknesses; 030304 developmental biology |
Source: | Health Services and Outcomes Research Methodology. 22:96-117 |
ISSN: | 1572-9400 1387-3741 |
DOI: | 10.1007/s10742-021-00255-7 |
Description: | This paper introduces Random Forest (RF), a machine learning method, in an accessible way for health services researchers and highlights the considerations specific to applying it to health administrative data. The data are physician claims from the universal public insurer in the Canadian province of Quebec, linked with the Canadian Community Health Survey. We describe in detail how RF can be useful in health services research, provide guidance on data setup and modeling decisions, and demonstrate how to interpret results. In a working example, we compare RF with logistic regression, ridge regression, and LASSO in their ability to predict whether a person has a regular medical doctor. We use survey responses to the question “Do you have a regular medical doctor?” from three cycles of the Canadian Community Health Survey (2007, 2009, 2011), linked with physician claims data from 2002 to 2012. We limit our cohort to persons aged 40 years and older at the time of responding to the survey. We discuss the strengths and weaknesses of using RF in a health services research setting compared with more conventional modeling techniques. Applying an RF model in this setting can have advantages over conventional modeling approaches, and we encourage health services researchers to add RF to their toolbox of predictive modeling methods. |
Database: | OpenAIRE |
External link: |