[A bioinformatic pipeline for NGS data analysis and mutation calling in human solid tumors].

Authors: Tsukanov KY; 'Genotek Ltd', Moscow, Russia., Krasnenko AY; 'Genotek Ltd', Moscow, Russia., Plakhina DA; 'Genotek Ltd', Moscow, Russia., Korostin DO; Vavilov Institute of General Genetics, Moscow, Russia., Churov AV; Institute of Biology of Karelian Research Centre, Petrozavodsk, Russia., Druzhilovskaya OS; Vavilov Institute of General Genetics, Moscow, Russia., Rebrikov DV; Vavilov Institute of General Genetics, Moscow, Russia., Ilinsky VV; 'Genotek Ltd', Moscow, Russia.
Language: Russian
Source: Biomeditsinskaia khimiia [Biomed Khim] 2017 Oct; Vol. 63 (5), pp. 413-417.
DOI: 10.18097/PBMC20176305413
Abstract: We aimed to develop a pipeline for the bioinformatic analysis and interpretation of NGS data and for the detection of a wide range of single-nucleotide somatic mutations in tumor DNA. First, the NGS reads underwent quality control with Cutadapt, and low-quality 3′ nucleotides were removed. The reads were then mapped to the reference genome hg19 (GRCh37.p13) with BWA. SAMtools was used to exclude duplicates, and MuTect was used for SNV calling. The functional effect of SNVs was evaluated by an algorithm that included annotation and assessment of SNV pathogenicity with SnpEff and analysis of databases such as COSMIC, dbNSFP, ClinVar, and OMIM. The effect of each SNV on protein function was estimated with SIFT and PolyPhen2. Mutation frequencies were obtained from the 1000 Genomes and ExAC projects, as well as from our own frequency databases. To evaluate the pipeline, we used 18 breast cancer tumor biopsies. The MYbaits Onconome KL v1.5 Panel ("MYcroarray") was used for targeted enrichment. NGS was performed on the Illumina HiSeq 2500 platform. As a result, we identified alterations in the BRCA1, BRCA2, ATM, CDH1, CHEK2, and TP53 genes that affected the sequence of the encoded proteins. Our pipeline can be used for effective search and annotation of tumor SNVs. In this study we tested the pipeline, for the first time, on NGS data from samples of patients from the Russian population. However, the efficiency and accuracy of the pipeline still need to be confirmed on larger NGS datasets and on data from several types of solid tumors.
Database: MEDLINE
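
The order of processing steps named in the abstract (Cutadapt, BWA, SAMtools, MuTect, SnpEff) can be illustrated with a minimal Python sketch that only assembles command lines and does not run them. This is not the authors' implementation: the abstract gives no parameters, so the quality cutoff of 20, the use of `bwa mem`, `samtools rmdup -s`, single-end input, the SnpEff database name `GRCh37.75`, the jar paths, and all file names are assumptions chosen for illustration. The sorting step is also an addition, included because duplicate removal expects coordinate-sorted input.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Stage:
    tool: str            # tool name as given in the abstract
    argv: List[str]      # command line (illustrative, see assumptions above)
    stdout_to: str = ""  # file that would capture stdout, if any


def build_pipeline(fastq: str, reference: str, prefix: str) -> List[Stage]:
    """Assemble the stages in the order described in the abstract."""
    trimmed = f"{prefix}.trimmed.fastq"
    sam = f"{prefix}.sam"
    sorted_bam = f"{prefix}.sorted.bam"
    dedup_bam = f"{prefix}.dedup.bam"
    vcf = f"{prefix}.vcf"
    annotated = f"{prefix}.ann.vcf"
    return [
        # Trim low-quality 3' ends (cutoff 20 is an assumption).
        Stage("Cutadapt", ["cutadapt", "-q", "20", "-o", trimmed, fastq]),
        # Map reads to the reference; "mem" is an assumed choice of algorithm.
        Stage("BWA", ["bwa", "mem", reference, trimmed], stdout_to=sam),
        # Coordinate-sort before duplicate removal (step not in the abstract).
        Stage("SAMtools sort", ["samtools", "sort", "-o", sorted_bam, sam]),
        # Exclude duplicates; -s treats reads as single-end (assumption).
        Stage("SAMtools rmdup", ["samtools", "rmdup", "-s", sorted_bam, dedup_bam]),
        # Call SNVs on the tumor sample; mutect.jar is a hypothetical path.
        Stage("MuTect", [
            "java", "-jar", "mutect.jar",
            "--analysis_type", "MuTect",
            "--reference_sequence", reference,
            "--input_file:tumor", dedup_bam,
            "--vcf", vcf,
        ]),
        # Annotate variants; snpEff.jar is a hypothetical path.
        Stage("SnpEff", ["java", "-jar", "snpEff.jar", "GRCh37.75", vcf],
              stdout_to=annotated),
    ]


def render(stage: Stage) -> str:
    """Format a stage as a shell-style command string."""
    cmd = " ".join(stage.argv)
    return f"{cmd} > {stage.stdout_to}" if stage.stdout_to else cmd


if __name__ == "__main__":
    for stage in build_pipeline("tumor.fastq", "hg19.fa", "sample1"):
        print(f"[{stage.tool}] {render(stage)}")
```

Keeping each stage as data rather than executing it directly makes the sequence easy to inspect or log before anything is run. The downstream steps from the abstract (database lookups in COSMIC, dbNSFP, ClinVar, and OMIM, plus SIFT and PolyPhen2 scoring) are omitted here because the abstract does not describe how they were invoked.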