A GLM-based zero-inflated generalized Poisson factor model for analyzing microbiome data

Autor: Jinling Chi, Jimin Ye, Ying Zhou
Jazyk: angličtina
Rok vydání: 2024
Předmět:
Zdroj: Frontiers in Microbiology, Vol 15 (2024)
Druh dokumentu: article
ISSN: 1664-302X
DOI: 10.3389/fmicb.2024.1394204
Popis: MotivationHigh-throughput sequencing technology facilitates the quantitative analysis of microbial communities, improving the capacity to investigate the associations between the human microbiome and diseases. Our primary motivating application is to explore the association between gut microbes and obesity. The complex characteristics of microbiome data, including high dimensionality, zero inflation, and over-dispersion, pose new statistical challenges for downstream analysis.ResultsWe propose a GLM-based zero-inflated generalized Poisson factor analysis (GZIGPFA) model to analyze microbiome data with complex characteristics. The GZIGPFA model is based on a zero-inflated generalized Poisson (ZIGP) distribution for modeling microbiome count data. A link function between the generalized Poisson rate and the probability of excess zeros is established within the generalized linear model (GLM) framework. The latent parameters of the GZIGPFA model constitute a low-rank matrix comprising a low-dimensional score matrix and a loading matrix. An alternating maximum likelihood algorithm is employed to estimate the unknown parameters, and cross-validation is utilized to determine the rank of the model in this study. The proposed GZIGPFA model demonstrates superior performance and advantages through comprehensive simulation studies and real data applications.
Databáze: Directory of Open Access Journals