A Computational Framework for Identifying Promoter Sequences in Nonmodel Organisms Using RNA-seq Data Sets
Authors: | Stephanie M. Ford, David A. C. Beck, Erin H. Wilson, Mary E. Lidstrom, Joseph Groom, M. Claire Sarfatis |
---|---|
Year of publication: | 2021 |
Subject: |
0106 biological sciences; Biomedical Engineering; Sigma Factor; RNA-Seq; Computational biology; Biology; 01 natural sciences; Biochemistry, Genetics and Molecular Biology (miscellaneous); Genome; Transcription initiation; Metabolic engineering; 03 medical and health sciences; Synthetic biology; 010608 biotechnology; Escherichia coli; Promoter Regions, Genetic; Gene; Transcription Initiation, Genetic; 030304 developmental biology; 0303 health sciences; Reporter gene; Base Sequence; Computational Biology; Promoter; DNA-Directed RNA Polymerases; General Medicine; RNA, Bacterial; Metabolic Engineering; Methylococcaceae; Transcription Initiation Site; Transcriptome; Genome, Bacterial |
Source: | ACS Synthetic Biology. 10:1394-1405 |
ISSN: | 2161-5063 |
DOI: | 10.1021/acssynbio.1c00017 |
Description: | Engineering microorganisms into biological factories that convert renewable feedstocks into valuable materials is a major goal of synthetic biology; however, for many nonmodel organisms, we do not yet have the genetic tools, such as suites of strong promoters, necessary to effectively engineer them. In this work, we developed a computational framework that can leverage standard RNA-seq data sets to identify sets of constitutive, strongly expressed genes and predict strong promoter signals within their upstream regions. The framework was applied to a diverse collection of RNA-seq data measured for the methanotroph Methylotuvimicrobium buryatense 5GB1 and identified 25 genes that were constitutively, strongly expressed across 12 experimental conditions. For each gene, the framework predicted short (27-30 nucleotide) sequences as candidate promoters and derived -35 and -10 consensus promoter motifs (TTGACA and TATAAT, respectively) for strong expression in M. buryatense. This consensus closely matches the canonical E. coli sigma-70 motif and was found to be enriched in promoter regions of the genome. A subset of promoter predictions was experimentally validated in a XylE reporter assay, including the consensus promoter, which showed high expression. The pmoC, pqqA, and ssrA promoter predictions were additionally screened in an experiment that scrambled the -35 and -10 signal sequences, confirming that transcription initiation was disrupted when these specific regions of the predicted sequence were altered. These results indicate that the computational framework can make biologically meaningful promoter predictions and identify key pieces of regulatory systems that can serve as foundational tools for engineering diverse microorganisms for biomolecule production. |
Database: | OpenAIRE |
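
The two steps summarized in the description (selecting constitutive, strongly expressed genes, then looking for -35/-10 signals upstream of them) can be illustrated with a minimal Python sketch. This is not the authors' framework: the function names, the top-fraction and coefficient-of-variation thresholds, the simple match-count score, and the 15-18 nucleotide spacer range (chosen only so that candidates come out 27-30 nucleotides long) are all assumptions made for illustration. Only the consensus hexamers TTGACA and TATAAT come from the record above.

```python
from statistics import mean, pstdev

# Consensus hexamers reported in the abstract.
MINUS_35 = "TTGACA"
MINUS_10 = "TATAAT"


def constitutive_strong_genes(expr, top_fraction=0.05, max_cv=0.5):
    """Pick genes that are both highly and stably expressed.

    expr maps gene name -> list of expression values (e.g. TPM), one per
    condition. The thresholds are illustrative assumptions, not values
    taken from the paper.
    """
    means = {gene: mean(values) for gene, values in expr.items()}
    n_top = max(1, int(len(means) * top_fraction))
    cutoff = sorted(means.values(), reverse=True)[n_top - 1]
    selected = []
    for gene, values in expr.items():
        m = means[gene]
        # "Strong": mean in the top fraction. "Constitutive": low CV.
        if m >= cutoff and m > 0 and pstdev(values) / m <= max_cv:
            selected.append(gene)
    return sorted(selected, key=lambda g: means[g], reverse=True)


def count_matches(a, b):
    """Number of positions at which two equal-length strings agree."""
    return sum(x == y for x, y in zip(a, b))


def best_promoter_hit(upstream, spacers=range(15, 19)):
    """Find the window whose -35 and -10 boxes best match the consensus.

    Returns (score, start, spacer, candidate) or None if the sequence is
    too short. score is the number of matching bases out of 12. The
    spacer range is an assumption.
    """
    upstream = upstream.upper()
    best = None
    for spacer in spacers:
        span = 6 + spacer + 6
        for i in range(len(upstream) - span + 1):
            box35 = upstream[i:i + 6]
            box10 = upstream[i + 6 + spacer:i + span]
            score = count_matches(box35, MINUS_35) + count_matches(box10, MINUS_10)
            if best is None or score > best[0]:
                best = (score, i, spacer, upstream[i:i + span])
    return best
```

A motif-discovery tool would normally learn the motif from the upstream regions rather than scan for a fixed consensus; the scan here only shows how a known -35/-10 pair can be located once a consensus is in hand.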