Stratified K-fold cross validation optimization on machine learning for prediction
Author: | Slamet Widodo, Herlambang Brawijaya, Samudi Samudi |
---|---|
Language: | English |
Year of publication: | 2022 |
Subject: | |
Source: | Sinkron : jurnal dan penelitian teknik informatika; Vol. 7 No. 4 (2022): Article Research: Volume 7 Number 4, October 2022; 2407-2414 |
ISSN: | 2541-044X 2541-2019 |
DOI: | 10.33395/sinkron.v7i4 |
Description: | Cervical cancer is the second most common malignant tumor in women, with 341,000 deaths worldwide in 2020, almost 80% of which occurred in developing countries. One cause is infection with human papillomavirus (HPV) types 16 and 18. The rising incidence of cervical cancer in Indonesia means the disease must be treated seriously, as it is one of the main causes of death. Besides the virus, external factors can also contribute. The high mortality rate arises because patients often become aware of cervical cancer only when it has reached the final stage. One way to reduce the number of sufferers is to implement cervical cancer detection. Early detection can also draw on external factors such as behavior, intentions, attitudes, norms, perceptions, motivations, social support, and empowerment. However, the data used are imbalanced in the distribution of the target class, with more negative samples than positive ones. To address this, Stratified K-Fold Cross-Validation (SKCV), which preserves the class proportions in every fold, is used. Accuracy is evaluated with a confusion matrix to determine the performance of each model. The best accuracies of the five classification algorithms used are 96 percent (RF), 94 percent (LR), 92 percent (XGBoost), 90 percent (KNN), and 88 percent (NB). The results show that the RF-based SKCV model has the highest accuracy among the models. |
Database: | OpenAIRE |
External link: |
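
The description names stratified K-fold cross-validation as the remedy for the imbalanced target class. The following is a minimal sketch of the idea in plain Python, not the authors' implementation: the helper `stratified_k_fold`, the toy label counts (50 negative, 20 positive), and the choice of five folds are all assumptions made for illustration and do not come from the paper.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k, seed=0):
    """Yield (train_idx, test_idx) pairs whose test folds keep the class ratio.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Shuffle within each class, then deal indices round-robin so every
    # fold receives a near-equal share of that class.
    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)

    # Each fold serves once as the test set; the rest form the training set.
    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        yield train_idx, test_idx


# Toy imbalanced labels (made up): 50 negatives (0) and 20 positives (1).
labels = [0] * 50 + [1] * 20
splits = list(stratified_k_fold(labels, k=5))

for fold_no, (train_idx, test_idx) in enumerate(splits, start=1):
    positives = sum(labels[i] for i in test_idx)
    print(f"fold {fold_no}: test size={len(test_idx)}, positives={positives}")
```

With these toy counts every test fold holds 10 negatives and 4 positives, so each fold mirrors the overall class ratio. In practice a library routine such as scikit-learn's `StratifiedKFold` would normally be used instead of a hand-written splitter.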