Development, study, and comparison of models of cross-immunity to the influenza virus using statistical methods and machine learning

Authors:	Marina N. Asatryan, Ilya S. Shmyr, Boris I. Timofeev, Dmitrii N. Shcherbinin, Vaagn G. Agasaryan, Tatiana A. Timofeeva, Ivan F. Ershov, Elita R. Gerasimuk, Anna V. Nozdracheva, Tatyana A. Semenenko, Denis Yu. Logunov, Aleksander L. Gintsburg
Language:	English, Russian
Year of publication:	2024
Subject:	influenza A virus subtype H3N2; antibody titers in HIA; cross immunity; antigenic distance; antigenic site; Hamming distance; AAindex databases; logistic regression; random forest method; gradient boosting; epidemiological model; immune landscape; vaccine strain; machine learning methods; Microbiology; QR1-502
Source:	Вопросы вирусологии, Vol 69, Iss 4, Pp 349-362 (2024)
Document type:	article
ISSN:	0507-4088 2411-2097
DOI:	10.36233/0507-4088-250
Description:	Introduction. The World Health Organization considers antibody titers in the hemagglutination inhibition assay to be one of the most important criteria for assessing successful vaccination. Mathematical modeling of cross-immunity allows new antigenic variants to be identified in real time, which is of paramount importance for human health. Materials and methods. This study uses statistical methods and machine learning techniques of increasing complexity: a logistic regression model, the random forest method, and gradient boosting. The calculations used the AAindex matrices in parallel with the Hamming distance. They were carried out with different types and values of antigenic escape thresholds, on four data sets. The results were compared using common binary classification metrics. Results. Model performance differed markedly depending on the data sets used. All three models performed best when forecasting the autumn season of 2022 after being trained on the February season of the same year (AUROC 0.934, 0.958, and 0.956, respectively). The lowest results were obtained when forecasting the entire year 2023 with models trained on data from two seasons of 2022 (AUROC 0.614, 0.658, and 0.775). The dependence of the results on the types of thresholds used and their values turned out to be insignificant. The additional use of AAindex matrices neither significantly improved nor significantly degraded the results of the models. Conclusion. More complex models show better results. When developing cross-immunity models, testing on a variety of data sets is necessary before making strong claims about their prognostic robustness.
Database:	Directory of Open Access Journals
External link:	https://doaj.org/article/74daf3ff47a14a918fe07fefb7c30524
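
The description names two building blocks that are easy to illustrate: the Hamming distance between aligned sequences and the AUROC used to compare binary classifiers. The following is a minimal sketch only, not the authors' pipeline. The five-residue fragments, the escape labels (1 = antigenically distinct pair), and the function names `hamming_distance` and `auroc` are all made up for illustration; the record does not give the study's actual features, thresholds, or model settings.

```python
from typing import Sequence


def hamming_distance(seq_a: str, seq_b: str) -> int:
    """Count positions at which two aligned, equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive example receives a
    higher score than a random negative one (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: (reference fragment, test fragment, escape label).
pairs = [
    ("KTSSN", "KTSSN", 0),
    ("KTSSN", "KTSGN", 0),
    ("KTSSN", "NTSGD", 1),
    ("KTSSN", "NASGD", 1),
]

# Use the raw Hamming distance as the classifier score.
distances = [hamming_distance(a, b) for a, b, _ in pairs]
labels = [y for _, _, y in pairs]

print("distances:", distances)
print("AUROC:", auroc(labels, distances))
```

Here the raw distance stands in for a model score. In a study of the kind described, such distances (optionally weighted with AAindex-derived substitution values) would more plausibly serve as input features to logistic regression, random forest, or gradient boosting models, whose predicted probabilities are then scored with AUROC.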