Simple and accurate transcriptional start site identification using Smar2C2 and examination of conserved promoter features.

Authors: Murray A; Department of Plant Biology, University of Georgia, Athens, GA, 30602, USA., Mendieta JP; Department of Genetics, University of Georgia, Athens, GA, 30602, USA., Vollmers C; Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, 95064, USA., Schmitz RJ; Department of Genetics, University of Georgia, Athens, GA, 30602, USA.
Language: English
Source: The Plant journal : for cell and molecular biology [Plant J] 2022 Oct; Vol. 112 (2), pp. 583-596. Date of Electronic Publication: 2022 Oct 02.
DOI: 10.1111/tpj.15957
Abstract: The precise and accurate identification and quantification of transcriptional start sites (TSSs) is key to understanding the control of transcription. The core promoter consists of the TSS and proximal non-coding sequences, which are critical in transcriptional regulation. Therefore, the accurate identification of TSSs is important for understanding the molecular regulation of transcription. Existing protocols for TSS identification are challenging and expensive, leaving high-quality data available for a small subset of organisms. This sparsity of data impairs study of TSS usage across tissues or in an evolutionary context. To address these shortcomings, we developed Smart-Seq2 Rolling Circle to Concatemeric Consensus (Smar2C2), which identifies and quantifies TSSs and transcription termination sites. Smar2C2 incorporates unique molecular identifiers that allowed for the identification of as many as 70 million sites, with no known upper limit. We have also generated TSS data sets from as little as 40 pg of total RNA, which was the smallest input tested. In this study, we used Smar2C2 to identify TSSs in Glycine max (soybean), Oryza sativa (rice), Sorghum bicolor (sorghum), Triticum aestivum (wheat) and Zea mays (maize) across multiple tissues. This wide panel of plant TSSs facilitated the identification of evolutionarily conserved features, such as novel patterns in the dinucleotides that compose the initiator element (Inr), that correlated with promoter expression levels across all species examined. We also discovered sequence variations in known promoter motifs that are positioned reliably close to the TSS, such as differences in the TATA box and in the Inr that may prove significant to our understanding and control of transcription initiation. Smar2C2 allows for the easy study of these critical sequences, providing a tool to facilitate discovery.
(© 2022 The Authors. The Plant Journal published by Society for Experimental Biology and John Wiley & Sons Ltd.)
Database: MEDLINE