Description: |
AIMS: Population-based health administrative data (HAD) are increasingly used to conduct observational studies in patients with inflammatory bowel disease (IBD). The inability to ascertain IBD phenotypes from HAD is a major limitation of using these data for research. We evaluated whether combinations of health care utilization patterns ascertained from HAD in Ontario, Canada, could be used to predict disease activity and extent at diagnosis in patients with ulcerative colitis (UC). Such predictions would significantly improve the quality of future HAD-based research in UC patients. METHODS: Consecutive patients diagnosed with UC at The Ottawa Hospital (TOH) between October 1, 2001 and March 31, 2012 were identified from a large hospital database and characterized through chart review for endoscopic disease extent and activity at initial diagnosis. Colitis extent was classified as proctitis, left-sided colitis or extensive colitis, based on the Montreal classification, and colitis activity was classified as normal, mild, moderate or severe, using the Mayo endoscopic classification. These patients were linked to Ontario HAD using unique identifiers. Health care utilization patterns and outcomes suspected to be associated with disease extent and/or activity were ascertained from Ontario HAD for each patient. Variables were individually tested for their association with disease phenotypes over 1, 2 and 3 years following UC diagnosis to determine the optimal exposure timeframe. Multivariable logistic regression was used to model 6 unique phenotypic classification schemes. Backward elimination was used to produce parsimonious models, and bootstrap validation was performed to produce accurate estimates of model performance statistics. RESULTS: A total of 587 UC patients characterized for disease extent and activity at diagnosis were linked to Ontario HAD to validate the predictive models.
Health care utilization and outcome parameters performed best when ascertained over 1 year following diagnosis. Multicollinearity was not observed among the 20 independent variables tested in the regression models. Following variable selection and bootstrapping, the final models modestly predicted disease phenotypes, with c-statistics ranging from 0.663 to 0.729. Interaction terms and variable transformations did not affect model fit or predictive power. CONCLUSIONS: Health care utilization patterns derived from Ontario HAD are moderately effective at predicting disease behaviour at diagnosis in patients with ulcerative colitis. External validation of the models will be conducted in future work. Similar strategies to improve the quality of IBD studies should be adopted in other jurisdictions that use HAD. FUNDING AGENCIES: None |