Classification Study and Prediction of Cervical Cancer

Author: Kaushik Suresh
Year of publication: 2020
Subject:
Source: Advances in Intelligent Systems and Computing ISBN: 9789811535130
DOI: 10.1007/978-981-15-3514-7_27
Description: Cancer has posed a grave threat to human life for more than a decade. Over time, the number of cases and related problems tends to grow, in step with the different lifestyles led by people. One such cancer, which plays a dominant role in the deaths of many women, is cervical cancer. This study classifies the presence of cervical cancer among subjects using machine learning classification approaches: logistic regression, decision tree, k-nearest neighbors (KNN), random forest and support vector machine. It constructs effective ensemble models and addresses the imbalanced class distribution in the dataset in order to identify the best classifier for detecting cervical cancer from the behavioral data of the subjects. Models are constructed from different types of input factors, such as the lifestyle, habits, medical history and sexual practices of the patients. Among all the models constructed with the different classification algorithms, KNN provides the best performance in classifying patients into two categories, affected and not affected by cervical cancer, for multiple target variables: the Hinselmann, Schiller, cytology and biopsy tests. The study also discusses the important features in the dataset responsible for cervical cancer, identified through variable importance using the random forest approach.
Database: OpenAIRE
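
The abstract names two ingredients, correcting an imbalanced class distribution and classifying with KNN, without saying how either was implemented. The following is a minimal sketch of one possible reading, not the author's method: random oversampling of the minority class is an assumed balancing technique, the helper names `oversample_minority` and `knn_predict` are hypothetical, and the two-feature points are invented toy values, not data from the study.

```python
import math
import random
from collections import Counter


def oversample_minority(rows, labels, seed=0):
    """Randomly duplicate rows of smaller classes until every class
    has as many rows as the largest one (assumed balancing technique)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def knn_predict(train_rows, train_labels, query, k=3):
    """Return the majority label among the k training rows closest
    to `query` by Euclidean distance."""
    nearest = sorted(
        zip(train_rows, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Invented toy points: six negatives (0) and two positives (1).
rows = [(0.1, 0.2), (0.2, 0.1), (0.0, 0.3), (0.3, 0.3),
        (0.2, 0.4), (0.4, 0.2), (0.9, 0.8), (0.8, 0.9)]
labels = [0, 0, 0, 0, 0, 0, 1, 1]

balanced_rows, balanced_labels = oversample_minority(rows, labels)
class_counts = Counter(balanced_labels)

pred_near_positives = knn_predict(balanced_rows, balanced_labels, (0.85, 0.85))
pred_near_negatives = knn_predict(balanced_rows, balanced_labels, (0.15, 0.2))
```

In a real evaluation, oversampling would be applied to the training split only, so that duplicated rows never appear in the data used to score the classifier.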