Identification of intermediate-sized deletions and inference of their impact on gene expression in a human population

Authors: Yukiko Yoshii, Akihiro Fujimoto, Hidewaki Nakagawa, Daichi Shigemizu, Shintaro Akiyama, Jing Hao Wong, Shu Narumiya, Azusa Tanaka
Year of publication: 2019
Subjects:
0301 basic medicine
lcsh:QH426-470
Intermediate-sized deletion
Quantitative Trait Loci
Population
lcsh:Medicine
Regulatory Sequences, Nucleic Acid
Biology
Genome
Expression quantitative trait loci (eQTL)
Genomic imputation
03 medical and health sciences
0302 clinical medicine
Genetic variation
Genetics
Humans
Long-read sequencing
education
Molecular Biology
Gene
Phylogeny
Genetics (clinical)
Sequence Deletion
Genetic association
education.field_of_study
Whole Genome Sequencing
Genome, Human
Research
lcsh:R
Computational Biology
High-Throughput Nucleotide Sequencing
Reproducibility of Results
Epistasis, Genetic
Molecular Sequence Annotation
Genomics
Human genetics
lcsh:Genetics
Genetics, Population
030104 developmental biology
Gene Expression Regulation
030220 oncology & carcinogenesis
Expression quantitative trait loci
Molecular Medicine
CRISPR-Cas Systems
Imputation (genetics)
Source: Genome Medicine
Genome Medicine, Vol 11, Iss 1, Pp 1-15 (2019)
ISSN: 1756-994X
DOI: 10.1186/s13073-019-0656-4
Description: Background: Next-generation sequencing has allowed for the identification of different genetic variations, which are known to contribute to diseases. Of these, insertions and deletions are the second most abundant type of variation in the genome, but their biological importance and disease associations are not well studied, especially for deletions of intermediate size.
Methods: We identified intermediate-sized deletions from whole-genome sequencing (WGS) data of Japanese samples (n = 174) with a novel deletion calling method that considers multiple samples. These deletions were used to construct a reference panel for imputation. Imputation was then conducted using the reference panel and data from 82 publicly available Japanese samples with gene expression data. The accuracy of the deletion calling and imputation was examined with Nanopore long-read sequencing technology. We also conducted an expression quantitative trait loci (eQTL) association analysis using the deletions to infer their functional impact on genes, and then characterized the deletions causal for gene expression level changes.
Results: We obtained a set of 4378 polymorphic high-confidence deletions and constructed a reference panel. The deletions were successfully imputed into the Japanese samples with high accuracy (97.3%). The eQTL analysis identified 181 deletions (4.1%) suggested as causal for gene expression level changes. The causal deletion candidates were significantly enriched in promoters, super-enhancers, and transcription elongation chromatin states. Generation of deletions in a cell line with the CRISPR-Cas9 system confirmed that they were indeed causative variants for gene expression change. Furthermore, one of the deletions was observed to affect the expression level of a gene in which it was not located.
Conclusions: This paper reports an accurate deletion calling method for genotype imputation at the whole-genome level and shows the importance of intermediate-sized deletions in the human population.
Electronic supplementary material: The online version of this article (10.1186/s13073-019-0656-4) contains supplementary material, which is available to authorized users.
Database: OpenAIRE