Anomaly identification during polymerase chain reaction for detecting SARS-CoV-2 using artificial intelligence trained from simulated data

Authors: Leonardo C. Pacheco-Londoño, Nataly J. Galán-Freyle, Paola Amar-Sepulveda, Reynaldo Villarreal-González, Jaime A. Garzon-Ochoa, Antonio J. Acosta-Hoyos
Language: English
Publication year: 2021
Subject:
Big Data
Artificial intelligence
Computer science
polymerase chain reaction
Pharmaceutical Science
Real-Time Polymerase Chain Reaction
Simulated data
Models, Biological

Article
Analytical Chemistry
lcsh:QD241-441
03 medical and health sciences
COVID-19 Testing
0302 clinical medicine
lcsh:Organic chemistry
Drug Discovery
False positive paradox
Humans
Physical and Theoretical Chemistry
Medical diagnosis
030304 developmental biology
0303 health sciences
Reverse Transcriptase Polymerase Chain Reaction
SARS-CoV-2
Organic Chemistry
Reproducibility of Results
Experimental data
COVID-19
Gold standard (test)
Identification (information)
Binary classification
Chemistry (miscellaneous)
030220 oncology & carcinogenesis
Molecular Medicine
Data verification
Source: Revista Molecules
Vol. 26, No. 1 (2021)
Molecules
Volume 26
Issue 1
Molecules, Vol 26, Iss 1, p 20 (2021)
Description: Real-time reverse transcription (RT) PCR is the gold standard for detecting Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), owing to its sensitivity and specificity, thereby meeting the demand for the rising number of cases. The scarcity of trained molecular biologists for analyzing PCR results makes data verification a challenge. Artificial intelligence (AI) was designed to ease verification by detecting atypical profiles in PCR curves caused by contamination or artifacts. Four classes of simulated real-time RT-PCR curves were generated, namely positive, early, no, and abnormal amplifications. Machine learning (ML) models were generated and tested using small amounts of data from each class. The best model was used for classifying the big data obtained by the Virology Laboratory of Simon Bolivar University from real-time RT-PCR curves for SARS-CoV-2, and the model was retrained and implemented in software that correlated patient data with test and AI diagnoses. The best strategy for AI included a binary classification model, generated from simulated data, in which data analyzed by the first model were classified as either positive or as negative and abnormal. To differentiate between negative and abnormal, the data were reevaluated using a second model. In the first model, the data required preanalysis through a combination of preprocessing steps. The early amplification class was eliminated from the models because the number of such cases in the big data was negligible. ML models can be created from simulated data using minimum available information. During analysis, changes or variations can be incorporated by generating simulated data, avoiding the incorporation of large amounts of experimental data encompassing all possible changes. For diagnosing SARS-CoV-2, this type of AI is critical for optimizing PCR tests because it enables rapid diagnosis and reduces false positives. Our method can also be used for other types of molecular analyses.
Database: OpenAIRE
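
Illustrative sketch (not part of the record): the description outlines a two-stage scheme trained on simulated curves, in which a first binary model separates positive curves from the rest and a second model separates negative from abnormal. The Python below is a minimal sketch of that idea under stated assumptions, not the authors' method. The curve shapes (sigmoid, flat baseline, linear drift), the parameter ranges, the two features, and the midpoint-threshold classifiers are all hypothetical stand-ins for the ML models the description mentions. The early amplification class is left out, as the description says it was eliminated.

```python
import math
import random

CYCLES = 40  # assumed number of PCR cycles per run


def simulate_curve(kind, rng):
    """Simulate one fluorescence curve; all shapes and ranges are assumptions."""
    cycles = range(1, CYCLES + 1)
    if kind == "positive":      # sigmoidal amplification
        ct = rng.uniform(18, 32)
        signal = [1.0 / (1.0 + math.exp(-0.6 * (c - ct))) for c in cycles]
    elif kind == "negative":    # flat baseline, no amplification
        signal = [0.0 for _ in cycles]
    elif kind == "abnormal":    # linear drift, one possible artifact shape
        slope = rng.uniform(0.02, 0.03)
        signal = [slope * c for c in cycles]
    else:
        raise ValueError(f"unknown class: {kind}")
    # Add small Gaussian measurement noise to every cycle.
    return [s + rng.gauss(0.0, 0.005) for s in signal]


def max_step(curve):
    """Largest cycle-to-cycle increase (steepness of the curve)."""
    return max(b - a for a, b in zip(curve, curve[1:]))


def final_level(curve):
    """Mean fluorescence over the last five cycles."""
    return sum(curve[-5:]) / 5


def fit_threshold(feature, low_curves, high_curves):
    """Midpoint between the mean feature values of two classes."""
    low = sum(map(feature, low_curves)) / len(low_curves)
    high = sum(map(feature, high_curves)) / len(high_curves)
    return (low + high) / 2


def train(rng, n=100):
    """Fit both stage thresholds from simulated data only."""
    data = {k: [simulate_curve(k, rng) for _ in range(n)]
            for k in ("positive", "negative", "abnormal")}
    # Stage 1: positive versus (negative + abnormal), using steepness.
    t1 = fit_threshold(max_step, data["negative"] + data["abnormal"],
                       data["positive"])
    # Stage 2: negative versus abnormal, using the end-of-run level.
    t2 = fit_threshold(final_level, data["negative"], data["abnormal"])
    return t1, t2


def classify(curve, t1, t2):
    """Apply the two stages in sequence."""
    if max_step(curve) > t1:
        return "positive"
    return "abnormal" if final_level(curve) > t2 else "negative"


if __name__ == "__main__":
    rng = random.Random(0)
    t1, t2 = train(rng)
    for kind in ("positive", "negative", "abnormal"):
        print(kind, "->", classify(simulate_curve(kind, rng), t1, t2))
```

New curve variations could be covered by extending `simulate_curve` and refitting, which reflects the description's point that simulated data can stand in for large experimental datasets.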