Development of an absolute assignment predictor for triple-negative breast cancer subtyping using machine learning approaches.

Authors: Ben Azzouz F; Unité de Bioinfomique, Institut de Cancérologie de L'Ouest, Bd Jacques Monod, 44805, Saint Herblain Cedex, France; SIRIC ILIAD, Nantes, Angers, France., Michel B; Unité de Bioinfomique, Institut de Cancérologie de L'Ouest, Bd Jacques Monod, 44805, Saint Herblain Cedex, France; SIRIC ILIAD, Nantes, Angers, France; Ecole Centrale de Nantes, 1 Rue de La Noë, 44300, Nantes, France; Laboratoire de Mathématiques Jean Leray, BP 92208, 2 Rue de La Houssinière, 44322, Nantes Cedex 03, France., Lasla H; Unité de Bioinfomique, Institut de Cancérologie de L'Ouest, Bd Jacques Monod, 44805, Saint Herblain Cedex, France; SIRIC ILIAD, Nantes, Angers, France., Gouraud W; Unité de Bioinfomique, Institut de Cancérologie de L'Ouest, Bd Jacques Monod, 44805, Saint Herblain Cedex, France; SIRIC ILIAD, Nantes, Angers, France., François AF; Ecole Centrale de Nantes, 1 Rue de La Noë, 44300, Nantes, France., Girka F; Ecole Centrale de Nantes, 1 Rue de La Noë, 44300, Nantes, France., Lecointre T; Ecole Centrale de Nantes, 1 Rue de La Noë, 44300, Nantes, France., Guérin-Charbonnel C; Unité de Bioinfomique, Institut de Cancérologie de L'Ouest, Bd Jacques Monod, 44805, Saint Herblain Cedex, France; SIRIC ILIAD, Nantes, Angers, France., Juin PP; SIRIC ILIAD, Nantes, Angers, France; CRCINA, INSERM, CNRS, Université de Nantes, Université D'Angers, Institut de Recherche en Santé-Université de Nantes, 8 Quai Moncousu - BP 70721, 44007, Nantes Cedex 1, France., Campone M; SIRIC ILIAD, Nantes, Angers, France; CRCINA, INSERM, CNRS, Université de Nantes, Université D'Angers, Institut de Recherche en Santé-Université de Nantes, 8 Quai Moncousu - BP 70721, 44007, Nantes Cedex 1, France; Oncologie Médicale, Institut de Cancérologie de L'Ouest - René Gauducheau, Bd Jacques Monod, 44805, Saint Herblain Cedex, France., Jézéquel P; Unité de Bioinfomique, Institut de Cancérologie de L'Ouest, Bd Jacques Monod, 44805, Saint Herblain Cedex, France; SIRIC ILIAD, Nantes, Angers, France; CRCINA, INSERM, CNRS, Université de Nantes, Université D'Angers, Institut de Recherche en Santé-Université de Nantes, 8 Quai Moncousu - BP 70721, 44007, Nantes Cedex 1, France. Electronic address: pascal.jezequel@ico.unicancer.fr.
Language: English
Source: Computers in biology and medicine [Comput Biol Med] 2021 Feb; Vol. 129, pp. 104171. Date of Electronic Publication: 2020 Dec 09.
DOI: 10.1016/j.compbiomed.2020.104171
Abstract: Triple-negative breast cancer (TNBC) heterogeneity represents one of the main obstacles to precision medicine for this disease. Recent concordant transcriptomics studies have shown that TNBC can be divided into at least three subtypes with potential therapeutic implications. Although a few studies have predicted TNBC subtype from transcriptomics data, their subtyping was only partially sensitive and was limited by batch effects and dependence on a given dataset, which may hinder the switch to routine diagnostic testing. Therefore, we sought to build an absolute predictor (i.e., intra-patient diagnosis) based on machine learning algorithms with a limited number of probes. To that end, we started by introducing binary comparisons between probes for each patient (indicators) and based the predictive analysis on this transformed data. Probe selection combined both filter and wrapper methods for variable selection using cross-validation. We tested three prediction models (random forest, gradient boosting [GB], and extreme gradient boosting) using this optimal subset of indicators as inputs. Nested cross-validation allowed us to choose the best model consistently. The results showed that the fifty selected indicators highlighted the biological characteristics associated with each TNBC subtype. The GB model based on this subset of indicators performed better than the other models.
(Copyright © 2020 Elsevier Ltd. All rights reserved.)
Database: MEDLINE
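
Illustration: the abstract describes transforming each patient's probe values into binary comparison indicators so that prediction relies only on that patient's own data. The record does not give the exact definition of these indicators, so the sketch below is one plausible reading and not the authors' implementation: it assumes an indicator is 1 when one probe's value is strictly higher than another's within the same patient. The function name `pairwise_indicators` and the probe names are hypothetical.

```python
from itertools import combinations


def pairwise_indicators(expression):
    """Turn one patient's probe values into 0/1 pairwise-comparison indicators.

    `expression` maps a probe name to its measured value (names are made up).
    Assumption: the indicator for the pair (a, b) is 1 when probe a is
    strictly higher than probe b in this patient, and 0 otherwise.
    """
    probes = sorted(expression)  # fixed order so indicator names are stable
    return {
        f"{a}>{b}": int(expression[a] > expression[b])
        for a, b in combinations(probes, 2)
    }


# Hypothetical patient profile with three made-up probes.
patient = {"probe_A": 7.2, "probe_B": 5.1, "probe_C": 9.8}
indicators = pairwise_indicators(patient)

# The same sample after an increasing rescaling (a stand-in for a simple
# batch effect): the ordering of probes within the patient is unchanged,
# so the indicators are unchanged too.
rescaled = {probe: 2.0 * value + 3.0 for probe, value in patient.items()}
same_after_rescaling = pairwise_indicators(rescaled) == indicators
```

Because each indicator depends only on the ordering of two probes within a single sample, any increasing transformation applied to the whole sample leaves it unchanged. Under the stated assumption, this is what would let such a predictor classify one patient without reference to a cohort.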