Automated invasive cervical cancer disease detection at early stage through suitable machine learning model
Author: | Manowarul Islam, Bikash Kumar Paul, Ayesha Aziz Prova, Tamanna Yesmin Rashme, Linta Islam, Mohammed Khaled Mosharof, M. D. Saimun Islam, Sohely Jahan |
---|---|
Publication year: | 2021 |
Subject: |
Technology
Computer science; Science; General Chemical Engineering; General Physics and Astronomy; Feature selection; Machine learning; computer.software_genre; Features selection; Feature (machine learning); medicine; Multilayer perceptron; General Materials Science; AdaBoost; General Environmental Science; Cervical cancer; business.industry; General Engineering; Classification; medicine.disease; Random forest; Statistical classification; SVC; General Earth and Planetary Sciences; Early-stage detection; Artificial intelligence; Gradient boosting; business; computer |
Source: | SN Applied Sciences, Vol 3, Iss 10, Pp 1-17 (2021) |
ISSN: | 2523-3971 2523-3963 |
DOI: | 10.1007/s42452-021-04786-z |
Description: | Cervical cancer is a common cancer that affects women all over the world. It is the fourth leading cause of death among women and has no symptoms in its early stages. Cervical cancer cells develop slowly at the cervix, and the cancer can be treated successfully if it is detected early. Detecting it before it spreads remains a major challenge for health professionals. This study applied various machine learning classification methods to predict cervical cancer from risk factors. Its main aim is to describe how the performance of eight classification algorithms varies with the set of top features selected from the dataset. The algorithms used to predict cervical cancer and support early diagnosis are Multilayer Perceptron (MLP), Random Forest, k-Nearest Neighbor, Decision Tree, Logistic Regression, SVC, Gradient Boosting and AdaBoost. A variety of approaches are used to handle missing values in the dataset. To choose the best features, a combination of feature selection techniques (Chi-square, SelectBest and Random Forest) was used. The performance of the classifiers is evaluated using accuracy, recall, precision and F1-score. MLP outperformed the other classification models on a variety of top feature sets. Most of the classification models, however, reach their highest accuracy on the top 25 features with a 70:30 dataset split. For each model, the percentage of correctly classified instances is presented, and all of the results are then discussed. Medical professionals will be able to use the suggested approach to perform research on cervical cancer. |
Database: | OpenAIRE |
External link: |
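
The workflow summarized in the description (handle missing values, select top features, split 70:30, compare eight classifiers on accuracy, precision, recall and F1-score) can be sketched with scikit-learn. This is a minimal illustration, not the authors' implementation: the data are synthetic, and the median imputation, the use of `SelectKBest` with `chi2` for feature selection, the default hyperparameters, and the helper names `make_demo_data` and `evaluate_models` are all assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, GradientBoostingClassifier,
                              RandomForestClassifier)
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score)
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def make_demo_data(seed=0):
    """Synthetic stand-in for a risk-factor table (NOT the paper's dataset)."""
    X, y = make_classification(n_samples=400, n_features=30, n_informative=8,
                               random_state=seed)
    rng = np.random.default_rng(seed)
    X[rng.random(X.shape) < 0.05] = np.nan  # inject about 5% missing values
    return X, y


def evaluate_models(X, y, k=25, seed=0):
    """Train eight classifiers on the top-k features with a 70:30 split."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.30, stratify=y, random_state=seed)
    models = {
        "MLP": MLPClassifier(max_iter=500, random_state=seed),
        "Random Forest": RandomForestClassifier(random_state=seed),
        "k-NN": KNeighborsClassifier(),
        "Decision Tree": DecisionTreeClassifier(random_state=seed),
        "Logistic Regression": LogisticRegression(max_iter=1000),
        "SVC": SVC(),
        "Gradient Boosting": GradientBoostingClassifier(random_state=seed),
        "AdaBoost": AdaBoostClassifier(random_state=seed),
    }
    results = {}
    for name, model in models.items():
        pipe = make_pipeline(
            SimpleImputer(strategy="median"),  # assumed imputation strategy
            MinMaxScaler(),                    # chi2 needs non-negative inputs
            SelectKBest(chi2, k=k),            # assumed top-k selector
            model,
        )
        pipe.fit(X_train, y_train)
        pred = pipe.predict(X_test)
        results[name] = {
            "accuracy": accuracy_score(y_test, pred),
            "precision": precision_score(y_test, pred, zero_division=0),
            "recall": recall_score(y_test, pred, zero_division=0),
            "f1": f1_score(y_test, pred, zero_division=0),
        }
    return results


if __name__ == "__main__":
    X, y = make_demo_data()
    for name, scores in evaluate_models(X, y).items():
        print(name, {m: round(v, 3) for m, v in scores.items()})
```

Fitting the imputer, scaler and selector inside one pipeline keeps them from seeing the held-out 30%, so the reported scores are not inflated by leakage. Rerunning `evaluate_models` with different values of `k` gives the kind of per-feature-set comparison the description refers to.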