Improving CNV Detection Performance in Microarray Data Using a Machine Learning-Based Approach.

Authors: Goh CJ; Eone-Diagnomics Genome Center, Inc., 143, Gaetbeol-ro, Yeonsu-gu, Incheon 21999, Republic of Korea., Kwon HJ; Eone-Diagnomics Genome Center, Inc., 143, Gaetbeol-ro, Yeonsu-gu, Incheon 21999, Republic of Korea.; Department of Computer Science and Engineering, Incheon National University (INU), Incheon 22012, Republic of Korea., Kim Y; Eone-Diagnomics Genome Center, Inc., 143, Gaetbeol-ro, Yeonsu-gu, Incheon 21999, Republic of Korea., Jung S; Eone-Diagnomics Genome Center, Inc., 143, Gaetbeol-ro, Yeonsu-gu, Incheon 21999, Republic of Korea., Park J; Eone-Diagnomics Genome Center, Inc., 143, Gaetbeol-ro, Yeonsu-gu, Incheon 21999, Republic of Korea., Lee IK; Eone-Diagnomics Genome Center, Inc., 143, Gaetbeol-ro, Yeonsu-gu, Incheon 21999, Republic of Korea.; Department of Computer Science and Engineering, Incheon National University (INU), Incheon 22012, Republic of Korea.; NGENI Foundation, San Diego, CA 92127, USA., Park BR; Eone-Diagnomics Genome Center, Inc., 143, Gaetbeol-ro, Yeonsu-gu, Incheon 21999, Republic of Korea., Kim MJ; Eone-Diagnomics Genome Center, Inc., 143, Gaetbeol-ro, Yeonsu-gu, Incheon 21999, Republic of Korea., Kim MJ; Diagnomics, Inc., 5795 Kearny Villa Rd., San Diego, CA 92123, USA., Lee MS; Eone-Diagnomics Genome Center, Inc., 143, Gaetbeol-ro, Yeonsu-gu, Incheon 21999, Republic of Korea.; Diagnomics, Inc., 5795 Kearny Villa Rd., San Diego, CA 92123, USA.
Language: English
Source: Diagnostics (Basel, Switzerland) [Diagnostics (Basel)] 2023 Dec 29; Vol. 14 (1). Date of Electronic Publication: 2023 Dec 29.
DOI: 10.3390/diagnostics14010084
Abstract: Copy number variation (CNV) is a primary source of structural variation in the human genome, leading to several disorders. Therefore, analyzing neonatal CNVs is crucial for managing CNV-related chromosomal disabilities. However, genomic waves can hinder accurate CNV analysis. To mitigate the influence of these waves, we adopted a machine learning approach and developed a new method that uses a modified log R ratio instead of the commonly used log R ratio. Validation results using samples with known CNVs demonstrated the superior performance of our method. We analyzed a total of 16,046 Korean newborn samples using the new method and identified CNVs related to 39 genetic disorders in 342 cases. The most frequently detected CNV-related disorder was Joubert syndrome 4. The accuracy of our method was further confirmed by analyzing a subset of the detected cases using NGS and comparing the NGS findings with our results. The utilization of a genome-wide single nucleotide polymorphism array with wave offset was shown to be a powerful method for identifying CNVs in neonatal cases. The accurate screening and the ability to identify various disease susceptibilities offered by our new method could facilitate the identification of CNV-associated chromosomal disease etiologies.
Database: MEDLINE