Optimal ranking and directional signature classification using the integral strategy of multi-objective optimization-based association rule mining of multi-omics data.

Authors: Mallik S; Environmental Health, Harvard T. H. Chan School of Public Health, Boston, MA, United States.; Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, United States., Seth S; Department of Computer Science and Engineering, Brainware University, Kolkata, India.; Department of Computer Science and Engineering, Aliah University, Kolkata, India., Si A; School of Information Technology, Maulana Abul Kalam Azad University of Technology, Haringhata, India., Bhadra T; Department of Computer Science and Engineering, Aliah University, Kolkata, India., Zhao Z; Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, United States.; Human Genetics Center, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, United States.
Language: English
Source: Frontiers in bioinformatics [Front Bioinform] 2023 Jul 27; Vol. 3, pp. 1182176. Date of Electronic Publication: 2023 Jul 27 (Print Publication: 2023).
DOI: 10.3389/fbinf.2023.1182176
Abstract: Introduction: Association rule mining (ARM) is a powerful tool for exploring informative relationships among multiple items (genes) in a dataset. The main problem with ARM is that it generates many rules with differing rule-informative values, which makes it difficult for the user to choose the effective rules. In addition, little work has been done on integrating multiple biological datasets and variable cutoff values in ARM. Methods: To address these problems, we developed a novel framework, MOOVARM (multi-objective optimized variable cutoff-based association rule mining), for multi-omics profiles. Results: We identified the positive ideal solution (PIS), which maximizes the profit and minimizes the loss, and the negative ideal solution (NIS), which minimizes the profit and maximizes the loss, over all gene sets (item sets) belonging to each extracted rule. We then computed the distance (d+) from the PIS and the distance (d-) from the NIS for each gene set or product. These two distances played an important role in determining the optimized associations among pairs of genes in the multi-omics dataset. We then globally estimated the relative closeness to the PIS to rank the gene sets. A rule whose relative closeness score is greater than or equal to the pre-defined threshold is retained as a final resultant rule. Moreover, MOOVARM evaluates the relative score of a rule based on the status of all its genes rather than individual genes. Conclusions: MOOVARM produced a final ranking of the extracted (multi-objective optimized) rules of correlated genes, which achieved better disease classification than state-of-the-art algorithms for gene signature identification.
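The ranking step described in the abstract (PIS, NIS, the two distances, and relative closeness) resembles the classical TOPSIS procedure. The sketch below illustrates that general idea only; it is not the authors' implementation. The function names, the vector normalization, the Euclidean distance, the unweighted criteria, and the example scores are all assumptions, since the abstract does not specify them.

```python
import math

def relative_closeness(scores, benefit):
    """Return a TOPSIS-style relative closeness to the PIS for each rule.

    scores  : list of rows, one per rule, each a list of criterion values
    benefit : list of bools, True for a "profit" criterion (maximize),
              False for a "loss" criterion (minimize)
    Assumption: vector normalization and unweighted Euclidean distance.
    """
    n_criteria = len(benefit)
    # Normalize each criterion column by its Euclidean norm (1.0 guards all-zero columns).
    norms = [math.sqrt(sum(row[j] ** 2 for row in scores)) or 1.0
             for j in range(n_criteria)]
    normed = [[row[j] / norms[j] for j in range(n_criteria)] for row in scores]
    columns = list(zip(*normed))
    # PIS takes the best value per criterion, NIS the worst.
    pis = [max(col) if b else min(col) for col, b in zip(columns, benefit)]
    nis = [min(col) if b else max(col) for col, b in zip(columns, benefit)]
    closeness = []
    for row in normed:
        d_plus = math.dist(row, pis)    # distance from the positive ideal solution
        d_minus = math.dist(row, nis)   # distance from the negative ideal solution
        total = d_plus + d_minus
        closeness.append(d_minus / total if total > 0 else 0.0)
    return closeness

def select_rules(names, closeness, threshold):
    """Rank rules by closeness (descending) and keep those at or above the threshold."""
    ranked = sorted(zip(names, closeness), key=lambda pair: pair[1], reverse=True)
    return [(name, c) for name, c in ranked if c >= threshold]

# Hypothetical rules scored on two profit criteria and one loss criterion.
rule_names = ["r1", "r2", "r3"]
rule_scores = [[0.9, 0.8, 0.1],
               [0.5, 0.5, 0.5],
               [0.1, 0.2, 0.9]]
is_benefit = [True, True, False]

rule_closeness = relative_closeness(rule_scores, is_benefit)
final_rules = select_rules(rule_names, rule_closeness, threshold=0.4)
```

A rule that is best on every criterion coincides with the PIS and scores 1, while a rule that is worst on every criterion coincides with the NIS and scores 0; the threshold then decides which rules survive into the final ranked list.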
Competing Interests: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
(Copyright © 2023 Mallik, Seth, Si, Bhadra and Zhao.)
Database: MEDLINE