Analysis of learning curves in predictive modeling using exponential curve fitting with an asymptotic approach.

Authors: Vianna LS; Graduate Program in Knowledge Engineering, Management, and Media, Federal University of Santa Catarina, Florianópolis, Santa Catarina, Brazil. Gonçalves AL; Graduate Program in Knowledge Engineering, Management, and Media, Federal University of Santa Catarina, Florianópolis, Santa Catarina, Brazil. Souza JA; Graduate Program in Knowledge Engineering, Management, and Media, Federal University of Santa Catarina, Florianópolis, Santa Catarina, Brazil.
Language: English
Source: PloS one [PLoS One] 2024 Apr 18; Vol. 19 (4), pp. e0299811. Date of Electronic Publication: 2024 Apr 18 (Print Publication: 2024).
DOI: 10.1371/journal.pone.0299811
Abstract: The existence of large volumes of data has considerably alleviated concerns regarding the availability of sufficient data instances for machine learning experiments. Nevertheless, in certain contexts, addressing limited data availability may demand distinct strategies and efforts. Analyzing COVID-19 predictions at the beginning of the pandemic raised a question: how much data is needed to make reliable predictions? When does the volume of data provide a better understanding of the disease's evolution and, in turn, offer reliable forecasts? Given these questions, the objective of this study is to analyze learning curves obtained from predicting the incidence of COVID-19 in Brazilian States using ARIMA models with limited available data. To fulfill this objective, a retrospective exploration of COVID-19 incidence across the Brazilian States was performed. After data acquisition and modeling, the model errors were assessed by means of a learning curve analysis. The asymptotic exponential curve fitting enabled the evaluation of the errors at different points, reflecting the increase in available data over time. For a comprehensive understanding of the results at distinct stages of the time evolution, the average derivative of the curves and the equilibrium points were calculated, with the aim of identifying the convergence of the ARIMA models to a stable pattern. We observed differences in average derivatives and equilibrium values among the multiple samples. While both metrics ultimately confirmed the convergence to stability, the equilibrium points were more sensitive to changes in the models' accuracy and provided a better indication of the learning progress. The proposed method for constructing learning curves enabled consistent monitoring of prediction results, providing the evidence-based insights required for informed decision-making.
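The abstract does not state the functional form of the fitted curve or the exact definitions of its two metrics, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes a learning curve of the form e(t) = a + b*exp(-c*t), treats the asymptote a as the equilibrium value, and takes the average derivative as the mean slope of the fitted curve over an interval. All function names, the search grid for c, and the sample data are hypothetical; the data are synthetic and not taken from the study.

```python
import math


def fit_asymptotic_exponential(ts, errors):
    """Fit e(t) = a + b*exp(-c*t) (assumed form) by least squares.

    For each candidate decay rate c, a and b follow from ordinary
    linear regression of the errors on x = exp(-c*t); the c with the
    smallest sum of squared residuals is kept.
    """
    n = len(ts)
    mean_e = sum(errors) / n
    best = None
    # Hypothetical search grid for c: log-spaced from 1e-3 to 10.
    for k in range(-150, 51):
        c = 10 ** (k / 50)
        xs = [math.exp(-c * t) for t in ts]
        mean_x = sum(xs) / n
        sxx = sum((x - mean_x) ** 2 for x in xs)
        if sxx == 0.0:
            continue  # regressor is constant, a and b are not identifiable
        sxy = sum((x - mean_x) * (e - mean_e) for x, e in zip(xs, errors))
        b = sxy / sxx
        a = mean_e - b * mean_x
        sse = sum((e - (a + b * x)) ** 2 for x, e in zip(xs, errors))
        if best is None or sse < best[0]:
            best = (sse, a, b, c)
    if best is None:
        raise ValueError("no usable decay rate found for these time points")
    _, a, b, c = best
    return a, b, c


def curve(a, b, c, t):
    """Evaluate the fitted curve at time t."""
    return a + b * math.exp(-c * t)


def average_derivative(a, b, c, t_start, t_end):
    """Mean slope of the fitted curve over [t_start, t_end] (assumed definition)."""
    return (curve(a, b, c, t_end) - curve(a, b, c, t_start)) / (t_end - t_start)


# Synthetic error series (not from the study): errors decay toward 2.0.
ts = list(range(30))
errors = [2.0 + 8.0 * math.exp(-0.1 * t) for t in ts]

a, b, c = fit_asymptotic_exponential(ts, errors)
equilibrium = a  # assumed: the asymptote as t grows without bound
early_slope = average_derivative(a, b, c, 0, 10)
late_slope = average_derivative(a, b, c, 20, 29)
print(equilibrium, early_slope, late_slope)
```

Under these assumptions, a late-interval slope close to zero together with a stable equilibrium value across successive refits would correspond to the convergence to stability described in the abstract. A production analysis would more likely use a nonlinear least-squares routine than this grid scan.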
Competing Interests: The authors have declared that no competing interests exist.
(Copyright: © 2024 Vianna et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.)
Database: MEDLINE