Pilot-Study to Explore Metabolic Signature of Type 2 Diabetes: A Pipeline of Tree-Based Machine Learning and Bioinformatics Techniques for Biomarkers Discovery

Authors:	Fatma Hilal Yagin, Fahaid Al-Hashem, Irshad Ahmad, Fuzail Ahmad, Abedalrhman Alkhateeb
Language:	English
Publication year:	2024
Subject:	type 2 diabetes; biomarker discovery; metabolomics; machine learning; bioinformatics; Nutrition. Foods and food supply; TX341-641
Source:	Nutrients, Vol 16, Iss 10, p 1537 (2024)
Document type:	article
ISSN:	16101537 2072-6643
DOI:	10.3390/nu16101537
Description:	Background: This study aims to identify unique metabolomic biomarkers associated with Type 2 Diabetes (T2D) and to develop an accurate diagnostic model using tree-based machine learning (ML) algorithms integrated with bioinformatics techniques. Methods: Univariate and multivariate analyses, including fold change, receiver operating characteristic (ROC) curve analysis, and Partial Least-Squares Discriminant Analysis (PLS-DA), were used to identify biomarker metabolites whose concentrations differed significantly in T2D patients. Three tree-based algorithms that have demonstrated robustness in high-dimensional data analysis [eXtreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM), and Adaptive Boosting (AdaBoost)] were used to create a diagnostic model for T2D. Results: The biomarker discovery process, validated with three different approaches, identified Pyruvate, D-Rhamnose, AMP, pipecolate, Tetradecenoic acid, Tetradecanoic acid, Dodecanediothioic acid, Prostaglandin E3/D3 (isobars), ADP, and Hexadecenoic acid as potential biomarkers for T2D. The XGBoost model [accuracy = 0.831, F1-score = 0.845, sensitivity = 0.882, specificity = 0.774, positive predictive value (PPV) = 0.811, negative predictive value (NPV) = 0.857, and area under the ROC curve (AUC) = 0.887] had slightly higher performance measures than the other models. Conclusions: ML integrated with bioinformatics techniques offers accurate and promising discovery of candidate T2D biomarkers. The XGBoost model can successfully distinguish T2D based on metabolites.
Database:	Directory of Open Access Journals
External link:	https://doaj.org/article/01c8cc5e34c54ec1b171d1086feecea7
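
The threshold-based performance measures listed in the description (accuracy, F1-score, sensitivity, specificity, PPV, NPV) all derive from the four cells of a binary confusion matrix. The sketch below shows those standard definitions in Python. The function name `diagnostic_metrics` and the counts passed to it are assumptions for illustration only: the record does not give the confusion matrix, and the counts were chosen merely because they happen to be consistent with the reported figures to three decimals. AUC is omitted because it depends on the ranked prediction scores, not on a single confusion matrix.

```python
def diagnostic_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Compute standard binary-classification measures from confusion-matrix counts.

    tp/fn: positive (e.g. T2D) cases classified correctly / incorrectly
    tn/fp: negative (e.g. control) cases classified correctly / incorrectly
    """
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "f1": 2 * tp / (2 * tp + fp + fn),
        "sensitivity": tp / (tp + fn),   # true positive rate (recall)
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value (precision)
        "npv": tn / (tn + fn),           # negative predictive value
    }


# Hypothetical counts, NOT taken from the source; used only to illustrate the formulas.
metrics = diagnostic_metrics(tp=30, fn=4, tn=24, fp=7)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these illustrative counts, the six measures round to the values quoted in the description, which shows that the reported figures are mutually consistent with a single confusion matrix; it does not establish that these were the study's actual counts.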