Description: |
For patients with rare comorbidities, there are too few observations to accurately estimate the effectiveness of treatment. At the same time, all diagnoses, including rare ones, are part of the International Classification of Diseases (ICD). Grouping ICD codes into broader concepts (i.e., ontology adjustment) can not only increase the accuracy of estimated antidepressant effectiveness for patients with rare conditions but also prevent overfitting in big data analysis. In this study, data on 3,678,082 patients with depression who were treated with antidepressants were obtained from the OptumLabs® Data Warehouse (OLDW). For rare diagnoses, adjustments were made by using the likelihood ratio of the immediate broader concept in the ICD hierarchies. The accuracy of models in the training (90%) and test (10%) sets was examined using the area under the receiver operating characteristic curve (AROC). The gap between training and test AROC shows how much random noise was modeled. If the gap is large, then the parameters of the model, including the reported effectiveness of the antidepressant for patients with rare conditions, are suspect. There was, on average, a 9.0% reduction in the AROC gap after using the ontology adjustment. Therefore, ontology adjustment can reduce model overfitting, leading to better parameter estimates from the training set. |
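
The back-off idea described above, where a rare diagnosis borrows the likelihood ratio of its immediate broader ICD concept, can be illustrated with a minimal Python sketch. It is not the study's implementation. Several details are assumptions made only for illustration: the broader concept is approximated by dropping the last character of the code, the rarity threshold MIN_COUNT = 30 is arbitrary, add-half smoothing is used to keep ratios finite, and each record is treated as one patient with one diagnosis. The names parent_code and likelihood_ratios are hypothetical.

```python
from collections import Counter

MIN_COUNT = 30  # hypothetical rarity threshold; the abstract does not state one


def parent_code(code):
    """Return the immediate broader concept, or None at the 3-character level.

    Assumption: the ICD hierarchy is approximated by code prefixes,
    e.g. "F32.11" -> "F32.1" -> "F32". Chapters and blocks are ignored.
    """
    if len(code) <= 3:
        return None
    return code[:-1].rstrip(".")


def likelihood_ratios(records, min_count=MIN_COUNT):
    """Map each observed code to a likelihood ratio for treatment response.

    records: list of (icd_code, responded) pairs, one per patient (assumption).
    Codes seen fewer than min_count times use their parent's ratio instead.
    """
    pos, neg = Counter(), Counter()
    n_pos = n_neg = 0
    for code, responded in records:
        counts = pos if responded else neg
        node = code
        while node is not None:  # credit the code and every broader concept
            counts[node] += 1
            node = parent_code(node)
        if responded:
            n_pos += 1
        else:
            n_neg += 1

    def lr(node):
        # P(code | response) / P(code | no response), with add-half smoothing
        return ((pos[node] + 0.5) / (n_pos + 1.0)) / ((neg[node] + 0.5) / (n_neg + 1.0))

    result = {}
    for code in {c for c, _ in records}:
        parent = parent_code(code)
        is_rare = pos[code] + neg[code] < min_count
        # Back off one level only, to the immediate broader concept
        result[code] = lr(parent) if is_rare and parent is not None else lr(code)
    return result
```

In this sketch a code observed only once, such as a hypothetical "F32.2", would receive the ratio computed from all "F32" records pooled together, while frequently observed codes keep their own ratio.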