Author: |
Awoleke, Obadare O.; Sachdev, Kapil; Brown, Kevin A. |
Source: |
International Journal of Data Science and Analytics; 20240101, Issue: Preprints p1-18, 18p |
Abstract: |
Supercomputer network traffic data have traditionally been modeled using time-consuming computational models and black-box machine learning algorithms. We address this problem with a new approach: combining simple closed-form mathematical expressions with approximate Bayesian computation (ABC). These simple expressions give a white-box alternative to black-box machine learning, and their simplicity and relative ease of application make the models more likely to be adopted by the wider scientific community. The logistic function model provides the best fit to the data and the best predictions. It produced forecasts with less than 5% error using one-tenth of the available data (2000 ns) for the UR traffic pattern, and forecasts with approximately 5% error using 15% of the available data (3000 ns) for the NG traffic pattern. The modified Gaussian variogram function provided the next best fit, while the modified exponential variogram model resulted in the largest prediction errors. Regardless, the aleatory uncertainty band of every model encompassed the actual cumulative bytes sent in all the scenarios considered in our work. Accordingly, an additional conclusion is that if the goal is to compute a probabilistic interval that encompasses the true value of the cumulative bytes sent through the network, the analyst can use the aleatory uncertainty associated with any of the investigated models. |
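The pairing of a closed-form logistic curve with ABC described in the abstract can be illustrated with a minimal rejection-ABC sketch. This is not the authors' implementation: the uniform priors, the RMS distance, the keep-the-closest-draws tolerance, the deterministic simulator, and the synthetic normalized data are all assumptions made for illustration, and the names `logistic` and `abc_rejection` are hypothetical.

```python
import math
import random


def logistic(t, L, k, t0):
    """Closed-form logistic curve: L / (1 + exp(-k * (t - t0)))."""
    return L / (1.0 + math.exp(-k * (t - t0)))


def abc_rejection(times, observed, n_draws, n_keep, rng):
    """Rejection ABC: sample parameters from the priors and keep the
    n_keep draws whose curves lie closest to the observations."""
    scored = []
    for _ in range(n_draws):
        # Assumed (hypothetical) uniform priors on the three parameters.
        L = rng.uniform(0.5, 2.0)        # plateau of normalized cumulative bytes
        k = rng.uniform(0.0005, 0.005)   # growth rate per ns
        t0 = rng.uniform(0.0, 4000.0)    # inflection time in ns
        # Assumed distance: root-mean-square error against the observations.
        sq = sum((logistic(t, L, k, t0) - y) ** 2 for t, y in zip(times, observed))
        scored.append((math.sqrt(sq / len(times)), (L, k, t0)))
    scored.sort(key=lambda pair: pair[0])
    return [params for _, params in scored[:n_keep]]


rng = random.Random(0)

# Synthetic stand-in for the first 2000 ns of a trace (NOT data from the paper).
times = [float(t) for t in range(0, 2001, 100)]
observed = [logistic(t, 1.0, 0.002, 1500.0) + rng.gauss(0.0, 0.01) for t in times]

accepted = abc_rejection(times, observed, n_draws=5000, n_keep=50, rng=rng)

# Spread of the accepted forecasts at a later time (20000 ns chosen arbitrarily).
forecasts = [logistic(20000.0, L, k, t0) for L, k, t0 in accepted]
low, high = min(forecasts), max(forecasts)
print(f"forecast spread at 20000 ns: [{low:.3f}, {high:.3f}]")
```

The spread of the accepted forecasts is only a rough analogue of the uncertainty band mentioned in the abstract; a fuller treatment would add a noise model to the simulator and choose the tolerance explicitly.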
Database: |
Supplemental Index |