mRSC: Multi-dimensional Robust Synthetic Control
Author: | Muhammad Amjad, Vishal Misra, Devavrat Shah, Dennis Shen |
---|---|
Language: | English |
Year of publication: | 2019 |
Subject: |
FOS: Computer and information sciences;
Computer Networks and Communications; Econometrics (econ.EM); 020206 networking & telecommunications; 02 engineering and technology; 01 natural sciences; Methodology (stat.ME); FOS: Economics and business; 010104 statistics & probability; Hardware and Architecture; 0202 electrical engineering, electronic engineering, information engineering; Computer Science (miscellaneous); 0101 mathematics; Safety, Risk, Reliability and Quality; Software; Statistics - Methodology; Economics - Econometrics |
Source: | arXiv |
Description: | When evaluating the impact of a policy (e.g., gun control) on a metric of interest (e.g., crime rate), it may not be possible or feasible to conduct a randomized controlled trial. In such settings, where only observational data are available, synthetic control (SC) methods [2-4] provide a popular data-driven approach to estimating a "synthetic" or "virtual" control by combining measurements of "similar" alternatives or units (called "donors"). Recently, robust synthetic control (RSC) [7] was proposed as a generalization of SC to overcome the challenges of missing data and high levels of noise, while removing the reliance on expert domain knowledge for selecting donors. However, both SC and RSC (and its variants) estimate poorly when the pre-intervention period is too short. As the main contribution of this work, we propose a generalization of unidimensional RSC to multi-dimensional robust synthetic control (mRSC). mRSC incorporates multiple types of measurements (or metrics), in addition to the measurement of interest, when estimating a synthetic control, thus overcoming the challenge of poor inference due to limited pre-intervention data. We show that the mRSC algorithm, when using K relevant metrics, leads to a consistent estimator of the synthetic control for the target unit of interest under any metric. Our finite-sample analysis suggests that the mean-squared error (MSE) of our predictions decays to zero at a rate faster than that of the RSC algorithm by a factor of K and √K for the training (pre-intervention) and testing (post-intervention) periods, respectively. Additionally, we propose a principled scheme to combine multiple metrics of interest via a diagnostic test that evaluates whether adding a metric can be expected to improve inference. The mechanism we use to validate mRSC performance, time series prediction, is also an important and related contribution of this work.
We propose a method to predict the future evolution of a time series based on limited data when the notion of time is relative and not absolute, i.e., where we have access to a donor pool that has already undergone the desired future evolution. We conduct extensive experimentation to establish the efficacy of mRSC in three different scenarios: predicting the evolution of a metric of interest using synthetically generated data from a known factor model, forecasting the weekly sales of a Walmart store, and forecasting the score trajectory of a cricket game. |
Database: | OpenAIRE |
External link: |
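
Illustrative sketch (not part of the catalog record): the description above says that mRSC pools several metrics to estimate a synthetic control, but it does not spell out the algorithm. The Python sketch below shows one way the idea might look, assuming the usual robust synthetic control recipe: de-noise the donor data by keeping only the leading singular values, then fit donor weights by least squares on the pre-intervention period. The function name `mrsc_forecast`, its parameters, and the choice of plain hard thresholding and unweighted least squares are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def mrsc_forecast(donors, target_pre, rank):
    """Hypothetical multi-metric synthetic control forecast (illustrative only).

    donors:     array of shape (K, N, T); K metrics, N donor units, T time steps
    target_pre: array of shape (K, T0); the target unit before the intervention
    rank:       number of singular values kept when de-noising (assumed given)
    Returns an array of shape (K, T): the synthetic control for every metric.
    """
    K, N, T = donors.shape
    T0 = target_pre.shape[1]

    # Stack the K metrics side by side: one row per donor, K*T columns.
    stacked = np.concatenate(list(donors), axis=1)  # shape (N, K*T)

    # De-noise by hard-thresholding the singular values of the stacked matrix.
    U, s, Vt = np.linalg.svd(stacked, full_matrices=False)
    s[rank:] = 0.0
    denoised = ((U * s) @ Vt).reshape(N, K, T)

    # Fit a single set of donor weights on the pre-intervention data of
    # all K metrics at once (this pooling is what adds effective samples).
    A = denoised[:, :, :T0].reshape(N, K * T0).T  # shape (K*T0, N)
    b = target_pre.reshape(K * T0)
    weights, *_ = np.linalg.lstsq(A, b, rcond=None)

    # Apply the weights to the full de-noised donor trajectories.
    return np.einsum("n,nkt->kt", weights, denoised)
```

Columns `T0` onward of the returned array serve as the post-intervention forecast for the target unit. In practice `rank` would have to be chosen from the data (for example by inspecting the singular value spectrum), and metrics on different scales would likely need to be normalized before stacking; both steps are omitted here.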