Development of a Preliminary Screening Tool for Predicting Polycystic Ovarian Syndrome using Machine Learning and Deep Learning Models with Non Invasive Qualitative Features: A Case-control Study

Author: Hanumanth Narni, Vasudeva Rao Ananthasetty, SD Jilani
Language: English
Publication year: 2024
Subject:
Source: Journal of Clinical and Diagnostic Research, Vol 18, Iss 12, Pp 06-10 (2024)
Document type: article
ISSN: 2249-782X, 0973-709X
DOI: 10.7860/JCDR/2024/75199.20403
Description: Introduction: Polycystic Ovarian Syndrome (PCOS) is a prevalent endocrine disorder affecting women of reproductive age, characterised by irregular menstrual cycles, hyperandrogenism and polycystic ovaries. Despite its high prevalence, the diagnosis of PCOS remains challenging due to the variability in symptom presentation. Traditional diagnostic methods involve clinical evaluation, biochemical assays and ultrasound imaging. Machine Learning (ML) and Deep Learning (DL) models offer promising avenues for predicting probable cases of PCOS using non-invasive qualitative features. Aim: To develop and compare the performance of Random Forest (RF) and Feedforward Neural Network (FFNN) models in predicting PCOS using abundant non-invasive qualitative features. Materials and Methods: A retrospective case-control study was conducted with 100 cases and 100 controls, selected based on ultrasound-confirmed PCOS diagnosis, in the Obstetrics and Gynaecology departments of the Gayatri Vidya Parishad Institute of Healthcare and Medical Technology (GVP IHC MT) Medical College from February 2024 to October 2024. Data were collected using a structured questionnaire capturing demographic and clinical variables. Feature selection was performed using the Chi-square filter method, with 10 features identified as significant. The data were split into training (80%) and testing (20%) sets, and stratified 5-fold cross-validation was applied. Model performance was evaluated using accuracy, precision, recall, F1 score and Area Under Curve (AUC). Results: The RF model demonstrated high performance on the training set, with an average accuracy of 0.95, but exhibited variability on the testing set (accuracy of 0.80). The FFNN model showed consistent performance across both the training (accuracy of 0.80) and testing (accuracy of 0.82) datasets. The RF model identified irregular cycles and hirsutism as key predictors, while the FFNN model highlighted weight gain and abnormal Body Mass Index (BMI) as important features. The RF model required significantly less computational time than the FFNN model. Conclusion: The RF model is preferable for tasks requiring computational efficiency, while the FFNN model offers better generalisation. The complementary feature importance rankings suggest that integrating insights from both models could enhance the understanding of PCOS predictors. In epidemiological investigations, these models can be used as preliminary screening tools for identifying probable cases of PCOS using non-invasive qualitative features, especially in areas where diagnostic facilities are not available.
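The methods summarised above (Chi-square filter scoring, a stratified 80/20 split, and evaluation by accuracy, precision, recall and F1 score) can be illustrated with a minimal standard-library Python sketch. It is not the authors' code: the data are synthetic, the feature names `irregular_cycles` and `noise` are hypothetical, and a one-feature rule stands in for the RF and FFNN models. The 5-fold cross-validation and AUC steps are omitted.

```python
import random


def chi_square_2x2(feature, label):
    """Pearson chi-square statistic for a binary feature vs. a binary label."""
    n = len(label)
    obs = [[0, 0], [0, 0]]  # observed counts, indexed [feature value][label]
    for f, y in zip(feature, label):
        obs[f][y] += 1
    row = [sum(obs[0]), sum(obs[1])]
    col = [obs[0][0] + obs[1][0], obs[0][1] + obs[1][1]]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / n
            if expected > 0:
                stat += (obs[i][j] - expected) ** 2 / expected
    return stat


def stratified_split(labels, test_frac=0.2, seed=0):
    """Return (train, test) index lists that keep the class ratio in both parts."""
    rng = random.Random(seed)
    train, test = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(len(idx) * test_frac)
        test += idx[:cut]
        train += idx[cut:]
    return train, test


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = case)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": (tp + tn) / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    rng = random.Random(42)
    # Synthetic stand-in for the study design: 100 cases (1) and 100 controls (0).
    labels = [1] * 100 + [0] * 100
    # Hypothetical features: one associated with the label, one pure noise.
    irregular_cycles = [int(rng.random() < (0.8 if y else 0.2)) for y in labels]
    noise = [rng.randint(0, 1) for _ in labels]

    print("chi2 irregular_cycles:", chi_square_2x2(irregular_cycles, labels))
    print("chi2 noise:", chi_square_2x2(noise, labels))

    train_idx, test_idx = stratified_split(labels, test_frac=0.2, seed=0)
    # Stand-in classifier (an assumption, not RF or FFNN): predict a case
    # whenever the hypothetical irregular_cycles feature is present.
    y_true = [labels[i] for i in test_idx]
    y_pred = [irregular_cycles[i] for i in test_idx]
    print(classification_metrics(y_true, y_pred))
```

A filter method would rank all candidate features by this statistic and retain the highest-scoring ones; the study reports keeping 10. In practice a library such as scikit-learn would normally replace these hand-written helpers.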
Database: Directory of Open Access Journals