Predicting the depression in university students using stacking ensemble techniques over oversampling method

Author: Alfredo Daza Vergaray, Juan Carlos Herrera Miranda, Juana Bobadilla Cornelio, Atilio Rubén López Carranza, Carlos Fidel Ponce Sánchez
Language: English
Year of publication: 2023
Subject:
Source: Informatics in Medicine Unlocked, Vol 41, Pp 101295- (2023)
Document type: article
ISSN: 2352-9148
DOI: 10.1016/j.imu.2023.101295
Description: Background: Depression is a mental health disorder characterized by constant sadness lasting approximately 2 weeks; it impairs the ability to carry out daily activities, and those affected lose interest in things they previously enjoyed. About 1 billion people have mental disorders, and more than 300 million people have depression globally. Machine learning techniques are essential for predicting depression, as they help automate processes and create models that help analyze and solve the problem. Objective: To propose a method and 3 combined models based on Stacking to predict depression in students of a public university. Methods: The dataset comprised Computer and Systems Engineering students from a public university (n = 284). The data were cleaned and pre-processed using Python. For balancing, the data were divided according to the 5 values obtained and oversampled, distributing the data according to the condition. The balanced data were then partitioned, and cross-validation was used for training. For modeling and evaluation, 4 independent algorithms were used, and 3 combined models were proposed based on them. Results: Of the proposed combined models, Ensemble Stacking 1 and Stacking 2 achieved the best accuracy and ROC curve scores (micro and macro), with 94.69% and 100.00% respectively. Stacking 1 also obtained the best sensitivity, accuracy and F1-score: 94.22%, 94.09% and 94.12% respectively. Conclusions: This study emphasizes the application of the Ensemble Stacking method to detect depression early in students of a public university in Peru. The combined method improved the performance of depression prediction compared with the independent algorithms.
Database: Directory of Open Access Journals
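
The balancing step summarized under Methods can be illustrated with a minimal sketch. The abstract does not say which oversampling variant was used, so this assumes plain random oversampling (duplicating minority-class rows until every class matches the largest one). The function name `random_oversample` and the toy data are hypothetical and are not taken from the article.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of the smaller classes until every
    class has as many rows as the largest class (assumed variant)."""
    rng = random.Random(seed)

    # Group the rows by their class label.
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    # Every class is grown to the size of the largest class.
    target = max(len(group) for group in by_class.values())

    out_rows, out_labels = [], []
    for label, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for row in group + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical toy data: 5 condition levels with uneven counts.
toy_labels = [0] * 10 + [1] * 6 + [2] * 4 + [3] * 3 + [4] * 2
toy_rows = [[i] for i in range(len(toy_labels))]

balanced_rows, balanced_labels = random_oversample(toy_rows, toy_labels)
class_counts = Counter(balanced_labels)
```

After this step each of the 5 toy classes has the same number of rows. The balanced data could then be passed to a stacked classifier evaluated with cross-validation, for example with scikit-learn's `StackingClassifier`; the abstract does not name the library or the base algorithms the authors used.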