Anomaly identification during polymerase chain reaction for detecting SARS-CoV-2 using artificial intelligence trained from simulated data
Author: | Leonardo C. Pacheco-Londoño, Nataly J. Galán-Freyle, Paola Amar-Sepulveda, Reynaldo Villarreal-González, Jaime A Garzon-Ochoa, Antonio J. Acosta-Hoyos |
---|---|
Language: | English |
Publication year: | 2021 |
Subject: |
Big data; Artificial intelligence; Computer science; Polymerase chain reaction; Pharmaceutical Science; Real-Time Polymerase Chain Reaction; Simulated data; Models, Biological; Article; Analytical Chemistry; lcsh:QD241-441; 03 medical and health sciences; COVID-19 Testing; 0302 clinical medicine; lcsh:Organic chemistry; Drug Discovery; False positive paradox; Humans; Physical and Theoretical Chemistry; Medical diagnosis; 030304 developmental biology; 0303 health sciences; Reverse Transcriptase Polymerase Chain Reaction; business.industry; SARS-CoV-2; Organic Chemistry; Reproducibility of Results; Experimental data; COVID-19; Gold standard (test); Identification (information); Binary classification; Chemistry (miscellaneous); 030220 oncology & carcinogenesis; Molecular Medicine; Data verification; business |
Source: | Revista Molecules, Vol. 26, No. 1 (2021); Molecules, Volume 26, Issue 1; Molecules, Vol 26, Iss 1, p 20 (2021) |
Description: | Real-time reverse transcription (RT) PCR is the gold standard for detecting Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) owing to its sensitivity and specificity, and it has been used to meet the testing demand created by the rising number of cases. The scarcity of trained molecular biologists for analyzing PCR results makes data verification a challenge. An artificial intelligence (AI) system was designed to ease verification by detecting atypical profiles in PCR curves caused by contamination or artifacts. Four classes of simulated real-time RT-PCR curves were generated: positive, early, no, and abnormal amplification. Machine learning (ML) models were generated and tested using small amounts of data from each class. The best model was used to classify the big data obtained by the Virology Laboratory of Simon Bolivar University from real-time RT-PCR curves for SARS-CoV-2; the model was then retrained and implemented in software that correlated patient data with test and AI diagnoses. The best AI strategy used two binary classification models generated from simulated data: the first model classified the data as either positive or negative-and-abnormal, and the second model reevaluated the latter group to differentiate between negative and abnormal. For the first model, the data required preanalysis through a combination of preprocessing steps. The early amplification class was eliminated from the models because the number of such cases in the big data was negligible. ML models can be created from simulated data using minimal available information. During analysis, changes or variations can be incorporated by generating new simulated data, which avoids having to collect large amounts of experimental data encompassing all possible changes. For diagnosing SARS-CoV-2, this type of AI is critical for optimizing PCR tests because it enables rapid diagnosis and reduces false positives. Our method can also be used for other types of molecular analyses. |
Database: | OpenAIRE |
External link: |
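
The two-stage strategy summarized in the description (train on simulated curves, separate positive from everything else, then separate negative from abnormal) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the logistic curve shape, the linear drift used as the "abnormal" artifact, the 40-cycle length, the noise level, the two hand-picked features (`max_step`, `plateau`), and the one-dimensional nearest-centroid classifier standing in for the authors' ML models. The early amplification class is omitted, as in the final models.

```python
import math
import random

CYCLES = range(1, 41)  # assumed run length of 40 cycles


def simulate(kind, rng, noise=0.005):
    """Return one simulated fluorescence curve (hypothetical curve shapes)."""
    if kind == "positive":
        # Sigmoidal amplification with a random midpoint cycle.
        ct = rng.uniform(18, 32)
        base = [1 / (1 + math.exp(-0.5 * (c - ct))) for c in CYCLES]
    elif kind == "abnormal":
        # Non-sigmoidal linear drift, one possible artifact profile.
        slope = rng.uniform(0.01, 0.02)
        base = [slope * c for c in CYCLES]
    else:
        # "negative": flat baseline, no amplification.
        base = [0.0 for _ in CYCLES]
    return [v + rng.gauss(0, noise) for v in base]


def max_step(curve):
    """Preprocessing feature: largest cycle-to-cycle increase."""
    return max(b - a for a, b in zip(curve, curve[1:]))


def plateau(curve):
    """Preprocessing feature: mean signal over the last five cycles."""
    return sum(curve[-5:]) / 5


class CentroidModel:
    """One-feature nearest-centroid classifier (stand-in for an ML model)."""

    def __init__(self, feature):
        self.feature = feature
        self.centroids = {}

    def fit(self, curves, labels):
        groups = {}
        for curve, label in zip(curves, labels):
            groups.setdefault(label, []).append(self.feature(curve))
        self.centroids = {k: sum(v) / len(v) for k, v in groups.items()}
        return self

    def predict(self, curve):
        x = self.feature(curve)
        return min(self.centroids, key=lambda k: abs(x - self.centroids[k]))


def train_cascade(rng, n=200):
    """Fit both stages using simulated curves only."""
    kinds = ["positive", "negative", "abnormal"]
    data = [(simulate(k, rng), k) for k in kinds for _ in range(n)]
    # Stage 1: positive versus everything else.
    stage1 = CentroidModel(max_step).fit(
        [c for c, _ in data],
        ["positive" if k == "positive" else "other" for _, k in data],
    )
    # Stage 2: negative versus abnormal, trained on the non-positive curves.
    rest = [(c, k) for c, k in data if k != "positive"]
    stage2 = CentroidModel(plateau).fit(
        [c for c, _ in rest], [k for _, k in rest]
    )
    return stage1, stage2


def classify(curve, stage1, stage2):
    """Run the cascade: stage 2 only sees curves stage 1 did not call positive."""
    if stage1.predict(curve) == "positive":
        return "positive"
    return stage2.predict(curve)


rng = random.Random(0)
stage1, stage2 = train_cascade(rng)
for kind in ("positive", "negative", "abnormal"):
    print(kind, "->", classify(simulate(kind, rng), stage1, stage2))
```

In this sketch a new artifact type would be handled by adding another branch to `simulate` and retraining, which mirrors the point in the description that variations can be incorporated by generating simulated data instead of collecting new experimental data.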