A Bayesian zero-inflated Dirichlet-multinomial regression model for multivariate compositional count data.
Author: | Koslovsky MD; Department of Statistics, Colorado State University, Fort Collins, Colorado, USA. |
---|---|
Language: | English |
Source: | Biometrics [Biometrics] 2023 Dec; Vol. 79 (4), pp. 3239-3251. Date of Electronic Publication: 2023 Apr 03. |
DOI: | 10.1111/biom.13853 |
Abstract: | The Dirichlet-multinomial (DM) distribution plays a fundamental role in modern statistical methodology development and application. Recently, the DM distribution and its variants have been used extensively to model multivariate count data generated by high-throughput sequencing technology in omics research due to its ability to accommodate the compositional structure of the data as well as overdispersion. A major limitation of the DM distribution is that it is unable to handle excess zeros typically found in practice which may bias inference. To fill this gap, we propose a novel Bayesian zero-inflated DM model for multivariate compositional count data with excess zeros. We then extend our approach to regression settings and embed sparsity-inducing priors to perform variable selection for high-dimensional covariate spaces. Throughout, modeling decisions are made to boost scalability without sacrificing interpretability or imposing limiting assumptions. Extensive simulations and an application to a human gut microbiome dataset are presented to compare the performance of the proposed method to existing approaches. We provide an accompanying R package with a user-friendly vignette to apply our method to other datasets. (© 2023 The Authors. Biometrics published by Wiley Periodicals LLC on behalf of International Biometric Society.) |
Database: | MEDLINE |
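To make the idea of a zero-inflated Dirichlet-multinomial concrete, the following is a minimal simulation sketch, not the paper's model or its R package. It assumes one common construction: a Dirichlet draw is formed from normalized gamma variables, and each category is switched off (a structural zero) with its own probability before normalizing. The function name `sample_zidm`, its parameters, and the rule that forces at least one category to stay active are all illustrative assumptions; the abstract does not specify the parameterization.

```python
import random


def sample_zidm(n_total, alpha, zero_prob, rng):
    """Draw one count vector from an illustrative zero-inflated DM.

    n_total:   total number of counts (e.g. sequencing reads) to allocate
    alpha:     positive concentration parameters, one per category
    zero_prob: per-category probability of a structural zero (assumed form)
    rng:       a random.Random instance, for reproducibility
    """
    k = len(alpha)
    # At-risk indicators: False marks a structural zero for that category.
    at_risk = [rng.random() >= p for p in zero_prob]
    if not any(at_risk):
        # Assumption of this sketch: keep one category active so that
        # the proportions below are well defined.
        at_risk[rng.randrange(k)] = True

    # Dirichlet via normalized gammas; inactive categories are fixed at 0.
    gammas = [rng.gammavariate(a, 1.0) if r else 0.0
              for a, r in zip(alpha, at_risk)]
    total = sum(gammas)
    probs = [g / total for g in gammas]

    # Multinomial draw: allocate n_total counts given the proportions.
    counts = [0] * k
    for j in rng.choices(range(k), weights=probs, k=n_total):
        counts[j] += 1
    return counts, at_risk


rng = random.Random(0)
counts, at_risk = sample_zidm(100, [1.0, 2.0, 3.0, 0.5], [0.3] * 4, rng)
print(counts, at_risk)
```

Categories flagged as not at risk always receive a count of zero, while active categories can still show sampling zeros, which is the distinction a zero-inflated model is meant to capture.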