Description: |
Observational longitudinal studies are a common means of studying treatment efficacy and safety in chronic mental illness. In many such studies, treatment changes may be initiated by either the patient or their clinician and can thus vary widely across patients in timing, number, and type. Indeed, in the observational longitudinal pathway of the STEP-BD study of bipolar depression, one of the motivations for this work, no two patients have the same treatment history even after coarsening clinic visits to a weekly time scale. Estimating an optimal treatment regime from such data is challenging: one cannot naively pool patients with the same treatment history, as is required by methods based on inverse probability weighting, nor can one apply backward induction over the decision points, as is done in Q-learning and its variants. Thus, additional structure is needed to pool information effectively across patients and within a patient over time. Current scientific theory for many chronic mental illnesses maintains that a patient's disease status can be conceptualized as transitioning among a small number of discrete states. We use this theory to inform the construction of a partially observable Markov decision process model of patient health trajectories wherein observed health outcomes are dictated by a patient's latent health state. Using this model, we derive and evaluate estimators of an optimal treatment regime under two common paradigms for quantifying long-term patient health. The finite-sample performance of the proposed estimators is demonstrated through a series of simulation experiments and an application to the observational pathway of the STEP-BD study. We find that the proposed method provides high-quality estimates of an optimal treatment strategy in settings where existing approaches cannot be applied without ad hoc modifications. |