A Novel Detection Method for High-Order SNP Epistatic Interactions Based on Explicit-Encoding-Based Multitasking Harmony Search.

Authors: Tuo S (School of Computer Science and Technology, Xi'an University of Posts and Telecommunications, Xi'an, 710121, China; Shaanxi Key Laboratory of Network Data Analysis and Intelligent Processing, Xi'an, 710121, China; Xi'an Key Laboratory of Big Data and Intelligent Computing, Xi'an, 710121, China; tuo_sh@126.com), Jiang J (School of Electronic Engineering, Xi'an University of Posts and Telecommunications, Xi'an, 710121, China).
Language: English
Source: Interdisciplinary sciences, computational life sciences [Interdiscip Sci] 2024 Sep; Vol. 16 (3), pp. 688-711. Date of Electronic Publication: 2024 Jul 02.
DOI: 10.1007/s12539-024-00621-2
Abstract: To elucidate the genetic basis of complex diseases, it is crucial to discover the single-nucleotide polymorphisms (SNPs) that contribute to disease susceptibility. This is particularly challenging for high-order SNP epistatic interactions (HEIs), in which individual SNPs show small effects but their joint effect can be large. These interactions are difficult to detect because the search space is vast, encompassing billions of possible combinations, and because evaluating each combination is computationally expensive. This study proposes a novel explicit-encoding-based multitasking harmony search algorithm (MTHS-EE-DHEI) designed to address this challenge. The algorithm operates in three stages. First, a harmony search algorithm uses four lightweight evaluation functions, such as Bayesian network and entropy scores, to efficiently explore SNP combinations potentially related to disease status. Second, a G-test is applied to filter out statistically insignificant SNP combinations. Finally, two machine learning-based methods, multifactor dimensionality reduction (MDR) and random forest (RF), are used to validate the classification performance of the remaining significant SNP combinations. The study aims to demonstrate the effectiveness of MTHS-EE-DHEI in identifying HEIs relative to existing methods, potentially providing insight into the genetic architecture of complex diseases. The performance of MTHS-EE-DHEI was evaluated on twenty simulated disease datasets and three real-world datasets covering age-related macular degeneration (AMD), rheumatoid arthritis (RA), and breast cancer (BC). The results indicate that MTHS-EE-DHEI outperforms four state-of-the-art algorithms in both detection power and computational efficiency. The source code is available at https://github.com/shouhengtuo/MTHS-EE-DHEI.git.
(© 2024. International Association of Scientists in the Interdisciplinary Areas.)
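Illustrative note: the second stage described in the abstract filters candidate SNP combinations with a G-test. The sketch below is not taken from the authors' repository; it is a minimal, assumed formulation of a G-test of independence between a multi-SNP genotype combination and case/control status. The function name `g_statistic`, the 0/1/2 genotype coding, and the choice to count only observed genotype combinations in the degrees of freedom are all assumptions made for illustration.

```python
from collections import Counter
from math import log


def g_statistic(genotypes, labels):
    """G-test statistic for one SNP combination (hypothetical helper).

    genotypes: list of tuples, one per sample, holding the genotype codes
               (assumed 0/1/2) of the k SNPs in the combination.
    labels:    list of 0 (control) / 1 (case), same length as genotypes.
    Returns (G, degrees_of_freedom).
    """
    if len(genotypes) != len(labels):
        raise ValueError("genotypes and labels must have the same length")

    n = len(labels)
    cell = Counter(zip(genotypes, labels))  # observed count per (combination, status)
    row = Counter(genotypes)                # marginal count per genotype combination
    col = Counter(labels)                   # marginal count per disease status

    g = 0.0
    for (combo, status), observed in cell.items():
        expected = row[combo] * col[status] / n
        # Cells with zero observations never appear in `cell`,
        # so they contribute nothing, as in the usual G-test convention.
        g += observed * log(observed / expected)
    g *= 2.0

    # Assumption: only genotype combinations present in the data are counted.
    df = (len(row) - 1) * (len(col) - 1)
    return g, df


# Toy data: two SNPs, four samples, genotype combination fully separates cases.
samples = [(0, 0), (0, 0), (1, 1), (1, 1)]
status = [0, 0, 1, 1]
g_value, dof = g_statistic(samples, status)
```

A p-value would then be obtained from the chi-square distribution with `dof` degrees of freedom, for example with `scipy.stats.chi2.sf(g_value, dof)` if SciPy is available, and combinations whose p-value exceeds a chosen significance threshold would be discarded.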
Database: MEDLINE