Popis: |
This thesis focuses on Eigenvector Spatial Filtering (ESF), developed by Griffith (2000, 2003) as a methodology for handling general cross-sectional/spatial dependence. ESF uses a subset of the eigenvectors of a spatial weights matrix in a linear regression framework to approximate, and thereby control for, any spatially correlated terms in the underlying data-generating process. ESF therefore has the key advantage that it does not require the researcher to specify which parts of the model are spatially correlated. This advantage is the main driver of ESF’s growing popularity among applied economists. I extend the theory around ESF and its Lasso (Tibshirani, 1996) implementation to cases in which the structural equation being studied either includes or excludes endogenous variables, and demonstrate how the method can be applied in several empirical applications. This thesis comprises four chapters: the first is a literature review, the second and third are in econometrics, and the fourth is in environmental economics. The econometrics chapters propose Moran’s i-based Lasso procedures for estimating the parameters on exogenous and endogenous right-hand-side variables when the data are spatially dependent. The last chapter of this thesis uses the procedure proposed in the second chapter to account for spatial dependence when testing for the presence of the environmental Kuznets curve for forests. The first chapter reviews some common spatial economic models and how they are conventionally estimated, gives an overview of the limit theorems of Kojevnikov et al. (2021) for cross-sectionally dependent random variables (used in Chapter 3), and summarises penalised regressions with a focus on Lasso. The second chapter, entitled “Moran’s i Lasso: for spatially correlated models”, provides a theoretical contribution.
After an extensive evaluation of existing procedures for selecting the relevant subset of eigenvectors for ESF, I develop a new selection method called Moran’s i Lasso (Mi-Lasso). The procedure uses information about the overall level of spatial dependence present in the underlying data-generating process, contained in the Moran’s i, to determine a point estimate for the Lasso tuning parameter. I derive performance bounds and establish the conditions necessary for consistent eigenvector selection. The key advantages of the proposed estimator are that it is intuitive and substantially faster than Lasso based on cross-validation or any of the forward stepwise procedures proposed in the literature. The main simulation results show that the proposed selection procedure performs well in finite samples and in an application to house prices. I find that, compared with existing selection procedures, Mi-Lasso has one of the smallest biases and mean squared errors across a range of sample sizes and levels of spatial correlation. Additionally, through an evaluation of the properties of the spectral decomposition, I note that ESF can also handle higher-order spatial lags, which is confirmed in a simulation experiment. The third chapter, entitled “Moran’s i 2-Stage Lasso: for spatial models with endogenous variables”, also provides a theoretical contribution and is co-authored with Dr. Sylvain Barde and Dr. Guy Tchuente. It proposes a new way of estimating a spatial model that includes endogenous variables when the researcher’s main concerns are estimating only the direct effect and/or possible misspecification of the spatial weights matrix and spatial model. The proposed procedure uses Mi-Lasso to select the relevant first- and second-stage eigenvectors and then uses the union of the selected eigenvectors as controls in a two-stage least squares regression. The procedure is called Moran’s i 2-Stage Lasso (Mi-2SL).
We show the conditions necessary for consistent and asymptotically normal parameter estimation, assuming that the support (relevant) set of eigenvectors is known. Our Monte Carlo simulation results also show that Mi-2SL performs well when the spatial weights matrix has a high proportion of misspecified links. Our empirical application replicates the instrumental variables estimates of Cadena and Kovak (2016) using Mi-2SL and shows that Mi-2SL can improve the performance of the first stage. Finally, the fourth chapter, entitled “The Environmental Kuznets Curve for forests: an application of Mi-Lasso”, applies Mi-Lasso to a hotly debated question in environmental economics: does the relationship between a country’s economic development (proxied by per capita GDP) and its deforestation rate follow the inverse U-shaped curve postulated by the classic environmental Kuznets curve for forests? I use the Mi-Lasso methodology proposed in Chapter 2 to account for spatial dependence of an unknown functional form when testing for the presence of the environmental Kuznets curve for forests. I find evidence of a non-linear relationship, which in some cases is more complicated than the predicted inverse U-shaped curve. The average peak rate of deforestation appears to be falling over time, while the income level at which the deforestation rate starts to fall is increasing over time. Additionally, I find that if spatial dependence is not accounted for, the OLS estimates of the income coefficients are biased upward in absolute value. |