Prediction of Methicillin Resistance Staphylococcus aureus by Matrix-Assisted Laser Desorption/Ionization Time-of-Flight Mass Spectrometry
Author: | Bo-Yu Chu, 朱柏宇 |
---|---|
Year of publication: | 2017 |
Document type: | 學位論文 ; thesis |
Description: | 105 Methicillin-resistant Staphylococcus aureus (MRSA) is a "superbug" derived from Staphylococcus aureus, and MRSA strains are resistant to many kinds of antibiotics. Before treatment begins, it is necessary to determine whether a patient is infected with MRSA or with ordinary Staphylococcus aureus, known as methicillin-sensitive Staphylococcus aureus (MSSA). In this study, MRSA prediction models based on MALDI-TOF MS were built by finding template peaks in MRSA and MSSA mass spectra. Bacterial identification by MALDI-TOF MS rests on generating a spectral profile, the mass spectrum, from abundant bacterial proteins, the majority of which are ribosomal proteins. The mass-to-charge ratio (m/z) measured for these proteins may contain errors caused by the experimental environment or manual operation, so an observed m/z value may be higher or lower than the true value. To address this problem, we used a binning method with twelve error ranges, from 1 m/z to 12 m/z. The prediction models were designed using four machine learning (ML) methods: support vector machine (SVM), k-nearest neighbor (KNN), decision tree (J48), and random forest (RF). The study obtained clinicopathological features, including thousands of peaks, from 3,882 Staphylococcus aureus mass spectra. Two feature selection methods, Pearson correlation coefficient (PCC) and One Rule (OneR) attribute evaluation, were applied to select robust features. The accuracy, Matthews correlation coefficient (MCC), sensitivity, and specificity of the models were compared under cross-validation. For the final prediction models, 43 peaks were selected by PCC with a bin size of 10 m/z. In the evaluation of model performance, the random forest model significantly outperformed the other classifiers trained on the same features and intensity information, reaching 78.87% accuracy, 76.3% sensitivity, 81.7% specificity, and an MCC of 0.58.
This investigation has demonstrated that the selected peaks are effective for predicting MRSA. Many of the peaks chosen by the feature selection methods have already been reported in earlier studies as differing between MRSA and MSSA. |
Database: | Networked Digital Library of Theses & Dissertations |
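The binning, Pearson-correlation ranking, and evaluation metrics named in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names, the toy spectra, and the choice to keep the highest intensity when several peaks share a bin are all assumptions made for illustration, and absent peaks are assumed to count as intensity 0.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_size):
    """Collapse (m/z, intensity) peaks into fixed-width bins.

    Assumption: if several peaks share a bin, the highest intensity is kept.
    """
    binned = defaultdict(float)
    for mz, intensity in peaks:
        index = int(mz // bin_size)
        binned[index] = max(binned[index], intensity)
    return dict(binned)


def pearson(xs, ys):
    """Pearson correlation coefficient; 0.0 if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def rank_bins(spectra, labels, bin_size, top_k):
    """Rank bins by |PCC| between binned intensity and the class label.

    Returns (bin lower edge, |PCC|) pairs for the top_k bins.
    """
    binned = [bin_spectrum(s, bin_size) for s in spectra]
    all_bins = sorted({b for s in binned for b in s})
    scored = []
    for b in all_bins:
        column = [s.get(b, 0.0) for s in binned]  # absent peak -> intensity 0
        scored.append((abs(pearson(column, labels)), b))
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [(b * bin_size, score) for score, b in scored[:top_k]]


def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, and MCC for binary labels (1 = MRSA)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


# Toy data (hypothetical values): two MRSA spectra (label 1), two MSSA (label 0).
spectra = [
    [(2415.3, 10.0), (6592.1, 5.0)],
    [(2411.8, 8.0), (6597.9, 4.0)],
    [(3007.2, 7.0), (6594.4, 6.0)],
    [(3003.5, 9.0), (6590.6, 5.0)],
]
labels = [1, 1, 0, 0]

if __name__ == "__main__":
    print(rank_bins(spectra, labels, bin_size=10, top_k=2))
    print(classification_metrics([1, 1, 0, 0], [1, 0, 0, 0]))
```

In the toy data, the peaks at 2415.3 and 2411.8 fall into different bins at a bin size of 1 m/z but into the same bin at 10 m/z, which is the kind of measurement drift a wider error range is meant to absorb. The trade-off is that wider bins may also merge peaks from different proteins.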