Scalable Nonparametric Prescreening Method for Searching Higher-Order Genetic Interactions Underlying Quantitative Traits

Autor: Mikko J. Sillanpää, Juho A. J. Kontio
Rok vydání: 2019
Předmět:
Zdroj: Genetics
ISSN: 1943-2631
DOI: 10.1534/genetics.119.302658
Popis: The Gaussian process (GP) regression is theoretically capable of capturing higher-order gene-by-gene interactions important to trait variation non-exhaustively with high accuracy. Unfortunately, GP approach is scalable only for 100-200 genes and thus, not applicable for high... Gaussian process (GP)-based automatic relevance determination (ARD) is known to be an efficient technique for identifying determinants of gene-by-gene interactions important to trait variation. However, the estimation of GP models is feasible only for low-dimensional datasets (∼200 variables), which severely limits application of the GP-based ARD method for high-throughput sequencing data. In this paper, we provide a nonparametric prescreening method that preserves virtually all the major benefits of the GP-based ARD method and extends its scalability to the typical high-dimensional datasets used in practice. In several simulated test scenarios, the proposed method compared favorably with existing nonparametric dimension reduction/prescreening methods suitable for higher-order interaction searches. As a real-data example, the proposed method was applied to a high-throughput dataset downloaded from the cancer genome atlas (TCGA) with measured expression levels of 16,976 genes (after preprocessing) from patients diagnosed with acute myeloid leukemia.
Databáze: OpenAIRE