Showing 1 - 10 of 581 for search: '"Reiter, Jerome"'
Author:
Guha, Sharmistha; Reiter, Jerome P.
In the social and health sciences, researchers often make causal inferences using sensitive variables. These researchers, as well as the data holders themselves, may be ethically and perhaps legally obligated to protect the confidentiality of study p
External link:
http://arxiv.org/abs/2408.14766
Author:
Binette, Olivier; Reiter, Jerome P.
Commonly, AI or machine learning (ML) models are evaluated on benchmark datasets. This practice supports innovative methodological research, but benchmark performance can be poorly correlated with performance in real-world applications -- a construct
External link:
http://arxiv.org/abs/2406.10366
Author:
Yang, Yanjiao; Reiter, Jerome P.
Survey data collection often is plagued by unit and item nonresponse. To reduce reliance on strong assumptions about the missingness mechanisms, statisticians can use information about population marginal distributions known, for example, from census
External link:
http://arxiv.org/abs/2406.04599
We present an approach for modeling and imputation of nonignorable missing data under Gaussian copulas. The analyst posits a set of quantiles of the marginal distributions of the study variables, for example, reflecting information from external data
External link:
http://arxiv.org/abs/2406.03463
Author:
Kazan, Zeki; Reiter, Jerome P.
We describe Bayesian inference for the parameters of Gaussian models of bounded data protected by differential privacy. Using this setting, we demonstrate that analysts can and should take constraints imposed by the bounds into account when specifyin
External link:
http://arxiv.org/abs/2405.13801
Author:
Binette, Olivier; Baek, Youngsoo; Engineer, Siddharth; Jones, Christina; Dasylva, Abel; Reiter, Jerome P.
Entity resolution (record linkage, microclustering) systems are notoriously difficult to evaluate. Looking for a needle in a haystack, traditional evaluation methods use sophisticated, application-specific sampling schemes to find matching pairs of r
External link:
http://arxiv.org/abs/2404.05622
Author:
Lin, Tong; Reiter, Jerome P.
Several official statistics agencies release synthetic data as public use microdata files. In practice, synthetic data do not admit accurate results for every analysis. Thus, it is beneficial for agencies to provide users with feedback on the quality
External link:
http://arxiv.org/abs/2404.02519
Probabilistic record linkage is often used to match records from two files, in particular when the variables common to both files comprise imperfectly measured identifiers like names and demographic variables. We consider bipartite record linkage set
External link:
http://arxiv.org/abs/2311.13923
Author:
Wadekar, Adway S.; Reiter, Jerome P.
Published in:
Epidemiology (2024)
Surveys are commonly used to facilitate research in epidemiology, health, and the social and behavioral sciences. Often, these surveys are not simple random samples, and respondents are given weights reflecting their probability of selection into the
External link:
http://arxiv.org/abs/2311.00596
Author:
Bai, Eric A.; Beckner, Madeleine; Ju, Botao; Reiter, Jerome P.; Mouw, Ted; Merli, M. Giovanna
Many population surveys do not provide information on respondents' residential addresses, instead offering coarse geographies like zip code or higher aggregations. However, fine resolution geography can be beneficial for characterizing neighborhoods,
External link:
http://arxiv.org/abs/2310.13907