A simple measure of conditional dependence

Author: Azadkia, Mona; Chatterjee, Sourav
Year of publication: 2019
Subject:
Document type: Working Paper
Description: We propose a coefficient of conditional dependence between two random variables $Y$ and $Z$ given a set of other variables $X_1,\ldots,X_p$, based on an i.i.d. sample. The coefficient has a long list of desirable properties, the most important of which is that under absolutely no distributional assumptions, it converges to a limit in $[0,1]$, where the limit is $0$ if and only if $Y$ and $Z$ are conditionally independent given $X_1,\ldots,X_p$, and is $1$ if and only if $Y$ is equal to a measurable function of $Z$ given $X_1,\ldots,X_p$. Moreover, it has a natural interpretation as a nonlinear generalization of the familiar partial $R^2$ statistic for measuring conditional dependence by regression. Using this statistic, we devise a new variable selection algorithm, called Feature Ordering by Conditional Independence (FOCI), which is model-free, has no tuning parameters, and is provably consistent under sparsity assumptions. A number of applications to synthetic and real datasets are worked out.
Comment: 41 pages, 2 tables. Final version. To appear in Ann. Statist. An R package is available at https://CRAN.R-project.org/package=FOCI
Database: arXiv
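
Note: the description above states the properties of the coefficient but not its formula. The following is a minimal Python sketch of a nearest-neighbour rank estimator of this kind for the case $p \ge 1$, written from our reading of the paper and not taken from this record or from the FOCI R package. It assumes $R_i$ is the rank of $Y_i$, $N(i)$ is the index of the nearest neighbour of $X_i$, and $M(i)$ is the index of the nearest neighbour of $(X_i, Z_i)$, and computes $\sum_i (\min\{R_i, R_{M(i)}\} - \min\{R_i, R_{N(i)}\}) / \sum_i (R_i - \min\{R_i, R_{N(i)}\})$. The function names are hypothetical, and the formula should be checked against the paper before any serious use.

```python
import math
import random
from bisect import bisect_right


def ranks(y):
    """R_i = number of j with y_j <= y_i."""
    sorted_y = sorted(y)
    return [bisect_right(sorted_y, v) for v in y]


def nearest_neighbor(points):
    """Index of the nearest other point (Euclidean distance, brute force).

    Assumption: ties go to the lowest index here; a random tie-break
    would be the more careful choice for data with repeated points.
    """
    n = len(points)
    result = []
    for i in range(n):
        best, best_d = -1, math.inf
        for j in range(n):
            if j == i:
                continue
            d = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            if d < best_d:
                best, best_d = j, d
        result.append(best)
    return result


def conditional_dependence(y, z, x):
    """Sketch of the estimate of the dependence of y on z given x.

    y: list of floats; z and x: lists of equal-length tuples of floats.
    """
    n = len(y)
    r = ranks(y)
    n_idx = nearest_neighbor([tuple(xi) for xi in x])                # N(i)
    m_idx = nearest_neighbor([tuple(xi) + tuple(zi)
                              for xi, zi in zip(x, z)])              # M(i)
    num = sum(min(r[i], r[m_idx[i]]) - min(r[i], r[n_idx[i]]) for i in range(n))
    den = sum(r[i] - min(r[i], r[n_idx[i]]) for i in range(n))
    if den == 0:
        raise ValueError("denominator is zero; y may be a function of x alone")
    return num / den


if __name__ == "__main__":
    rng = random.Random(1)
    n = 200
    x = [(rng.random(),) for _ in range(n)]
    z = [(rng.random(),) for _ in range(n)]
    y_func = [xi[0] + zi[0] for xi, zi in zip(x, z)]       # y is a function of (x, z)
    y_free = [xi[0] + rng.gauss(0, 1) for xi in x]         # y ignores z given x
    print("functional dependence:", conditional_dependence(y_func, z, x))
    print("conditional independence:", conditional_dependence(y_free, z, x))
```

On simulated data one would expect the first value to be close to 1 and the second to be close to 0, up to sampling noise. The brute-force neighbour search is quadratic in the sample size and is meant only for illustration; the R package listed in the comment field is the place to look for the authors' implementation.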