Super learner model for classifying leukemia through gene expression monitoring

Authors: Sharanya Selvaraj, Alhuseen Omar Alsayed, Nor Azman Ismail, Balasubramanian Prabhu Kavin, Edeh Michael Onyema, Gan Hong Seng, Arinze Queen Uchechi
Language: English
Year of publication: 2024
Subject:
Source: Discover Oncology, Vol 15, Iss 1, Pp 1-13 (2024)
Document type: article
ISSN: 2730-6011
DOI: 10.1007/s12672-024-01337-x
Description: Leukemia is a cancer that affects the bone marrow and lymphatic system, and it requires complex treatment strategies that vary by subtype. Because the morphological differences among subtypes are subtle, monitoring gene expression is crucial for accurate classification. Manual or pathological testing can be time-consuming and expensive, so data-driven methods and machine learning algorithms offer an efficient alternative for leukemia classification. This study introduces a novel super learning model that combines heterogeneous machine learning models to analyze gene expression data and classify leukemia cells. The proposed approach incorporates an entropy-based feature importance technique to identify the gene profiles most significant to the labeling process. The strength of the model lies in its final super learner, a Random Forest, which classifies the cross-validated outputs of the candidate learners. Validation on a gene expression monitoring dataset demonstrates that this model outperforms other state-of-the-art models in predictive accuracy. The study contributes to knowledge on the use of advanced machine learning techniques to improve the accuracy and reliability of leukemia classification from gene expression data, addressing the limitations of traditional methods that rely on clinical features and morphological examination.
Database: Directory of Open Access Journals
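
The pipeline outlined in the abstract (entropy-based feature selection, heterogeneous candidate learners, and a Random Forest trained on their cross-validated outputs) can be illustrated with a minimal Python sketch using scikit-learn. Everything specific here is an assumption rather than the authors' implementation: the data are synthetic (not the gene expression dataset used in the study), mutual information stands in for the unspecified entropy-based importance measure, and the three candidate learners, the number of selected features, and the fold count are arbitrary choices.

```python
from functools import partial

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a gene expression matrix: few samples, many features.
# (Assumption: the real study data are not reproduced here.)
X, y = make_classification(
    n_samples=120, n_features=200, n_informative=10,
    n_redundant=0, class_sep=2.0, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Heterogeneous candidate learners (hypothetical choice for illustration).
candidates = [
    ("logreg", LogisticRegression(max_iter=1000)),
    ("knn", KNeighborsClassifier(n_neighbors=5)),
    ("nb", GaussianNB()),
]

# The final learner is a Random Forest fitted on the candidates'
# out-of-fold (5-fold cross-validated) predictions.
super_learner = StackingClassifier(
    estimators=candidates,
    final_estimator=RandomForestClassifier(n_estimators=100, random_state=0),
    cv=5,
)

# Mutual information is used as one possible entropy-based relevance score;
# the top 20 features are kept (k=20 is an arbitrary assumption).
model = Pipeline([
    ("scale", StandardScaler()),
    ("select", SelectKBest(partial(mutual_info_classif, random_state=0), k=20)),
    ("stack", super_learner),
])

model.fit(X_train, y_train)
preds = model.predict(X_test)
accuracy = model.score(X_test, y_test)
print(f"Held-out accuracy on synthetic data: {accuracy:.3f}")
```

Placing the feature selector inside the pipeline means it is fitted only on the training split, so the held-out score is not influenced by information from the test samples.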