Polyadenylation-related isoform switching in human evolution revealed by full-length transcript structure.

Autor: Li Y; Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, Peking University, Beijing, China., Shen QS; Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, Peking University, Beijing, China., Peng Q; Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, Peking University, Beijing, China.; College of Future Technology, Peking University, Beijing, China., Ding W; Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, Peking University, Beijing, China.; College of Future Technology, Peking University, Beijing, China., Zhang J; Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, Peking University, Beijing, China.; College of Future Technology, Peking University, Beijing, China., Zhong X; Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, Peking University, Beijing, China., An NA; Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, Peking University, Beijing, China.; College of Future Technology, Peking University, Beijing, China., Ji M; Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, Peking University, Beijing, China.; College of Future Technology, Peking University, Beijing, China., Zhou WZ; State Key Laboratory of Cardiovascular Disease, Fuwai Hospital, Beijing, China., Li CY; Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, Peking University, Beijing, China.; College of Future Technology, Peking University, Beijing, China.
Jazyk: angličtina
Zdroj: Briefings in bioinformatics [Brief Bioinform] 2021 Nov 05; Vol. 22 (6).
DOI: 10.1093/bib/bbab157
Abstrakt: Rhesus macaque is a unique nonhuman primate model for human evolutionary and translational study, but the error-prone gene models critically limit its applications. Here, we de novo defined full-length macaque gene models based on single molecule, long-read transcriptome sequencing in four macaque tissues (frontal cortex, cerebellum, heart and testis). Overall, 8 588 227 poly(A)-bearing complementary DNA reads with a mean length of 14 106 nt were generated to compile the backbone of macaque transcripts, with the fine-scale structures further refined by RNA sequencing and cap analysis gene expression sequencing data. In total, 51 605 macaque gene models were accurately defined, covering 89.7% of macaque or 75.7% of human orthologous genes. Based on the full-length gene models, we performed a human-macaque comparative analysis on polyadenylation (PA) regulation. Using macaque and mouse as outgroup species, we identified 79 distal PA events newly originated in humans and found that the strengthening of the distal PA sites, rather than the weakening of the proximal sites, predominantly contributes to the origination of these human-specific isoforms. Notably, these isoforms are selectively constrained in general and contribute to the temporospatially specific reduction of gene expression, through the tinkering of previously existed mechanisms of nuclear retention and microRNA (miRNA) regulation. Overall, the protocol and resource highlight the application of bioinformatics in integrating multilayer genomics data to provide an intact reference for model animal studies, and the isoform switching detected may constitute a hitherto underestimated regulatory layer in shaping the human-specific transcriptome and phenotypic changes.
(© The Author(s) 2021. Published by Oxford University Press.)
Databáze: MEDLINE
Nepřihlášeným uživatelům se plný text nezobrazuje