PrediTALE: A novel model learned from quantitative data allows for new perspectives on TALE targeting

Authors: Maik Reschke, Jan Grau, Annett Erkes, Jens Boch, Stefanie Mücke
Language: English
Year of publication: 2019
Subject:
Dewey Decimal Classification::500 | Natural sciences::570 | Life sciences, biology
Gene Expression
Biochemistry
Transcription (biology)
Biology (General)
Plant-pathogenic
biology
Virulence
Effector
Xanthomonas bacteria
Eukaryota
Genomics
Plants
Experimental Organism Systems
Tandem Repeat Sequences
Transcription Initiation Site
Genome, Plant
Research Article
Xanthomonas
QH301-705.5
DNA transcription
TALEs
Computational biology
Genes, Plant
Research and Analysis Methods
Models, Biological
DNA sequencing
Tandem repeat
Plant and Algal Models
ddc:570
DNA-binding proteins
Genetics
Gene Regulation
Grasses
Gene Prediction
Gene
Transcription Activator-Like Effectors
Plant Diseases
Host Microbial Interactions
Bacteria
Organisms
Biology and Life Sciences
Proteins
Oryza
Gene Annotation
biology.organism_classification
Genome Analysis
Regulatory Proteins
Animal Studies
Rice
Transcription Factors
Source: PLoS Computational Biology 15 (2019), No. 7
PLoS Computational Biology
PLoS Computational Biology, Vol 15, Iss 7, p e1007206 (2019)
Description: Plant-pathogenic Xanthomonas bacteria secrete transcription activator-like effectors (TALEs) into host cells, where they act as transcriptional activators on plant target genes to support bacterial virulence. TALEs have a unique modular DNA-binding domain composed of tandem repeats. Two amino acids within each tandem repeat, termed repeat-variable diresidues (RVDs), bind to contiguous nucleotides on the DNA sequence and determine target specificity. In this paper, we propose a novel approach for TALE target prediction to identify potential virulence targets. Our approach accounts for recent findings concerning TALE targeting, including frame-shift binding by repeats of aberrant lengths and the flexible strand orientation of target boxes relative to the transcription start of the downstream target gene. The computational model can account for dependencies between adjacent RVD positions. Model parameters are learned from the wealth of quantitative data generated over recent years. We benchmark the novel approach, termed PrediTALE, using RNA-seq data after Xanthomonas infection in rice, and find an overall improvement in prediction performance compared with previous approaches. Using PrediTALE, we are able to predict several novel putative virulence targets. However, we also observe that for several TALEs no target genes are predicted by any prediction tool; we term these orphan TALEs. We postulate that one explanation for orphan TALEs is incomplete gene annotation and, hence, propose to replace promoterome-wide scans with genome-wide scans for target boxes. We demonstrate that known targets from promoterome-wide scans may be recovered by genome-wide scans, whereas the latter, combined with RNA-seq data, are able to detect putative targets independent of existing gene annotations.
Author summary: Diseases caused by plant-pathogenic Xanthomonas bacteria are a serious threat to many important crop plants, including rice. Efficiently protecting plants from these pathogens requires a deeper understanding of infection strategies. For many Xanthomonas strains, such infection strategies depend on a special class of effector proteins, termed transcription activator-like effectors (TALEs). TALEs may specifically activate genes of the host plant and, by this means, re-program the plant cell for the benefit of the pathogen. Target sequences and, consequently, target genes of a specific TALE may be predicted computationally from its amino acid sequence. Here, we propose a novel approach for TALE target prediction that makes use of several insights into TALE biology as well as the broad experimental data gained over recent years. We demonstrate that this approach yields a higher prediction accuracy than previous approaches. We further postulate that a strategy change from a restricted search considering only promoters of annotated genes to a broad genome-wide search is feasible and yields novel targets, including previously neglected protein-coding genes as well as non-coding RNAs of possibly regulatory function.
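Illustrative note: the description above says that each RVD contacts one nucleotide and that target boxes may lie on either strand. The following is a minimal sketch of that general idea only, not the PrediTALE model: the weights in `RVD_PREFS` are simplified, hypothetical values based on commonly cited RVD preferences, the function names are invented for illustration, and the sketch ignores the dependencies between adjacent RVDs and the aberrant-repeat frame shifts that the record mentions.

```python
# Hypothetical, simplified RVD-to-nucleotide weights for illustration.
# These are NOT the parameters learned by PrediTALE.
RVD_PREFS = {
    "HD": {"C": 1.0},
    "NG": {"T": 1.0},
    "NI": {"A": 1.0},
    "NN": {"G": 1.0, "A": 0.5},
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def score_box(rvds, box):
    """Sum independent per-position weights, one nucleotide per RVD."""
    return sum(RVD_PREFS.get(rvd, {}).get(nt, 0.0)
               for rvd, nt in zip(rvds, box))


def scan(rvds, sequence):
    """Score every window on both strands; return hits sorted best first.

    Each hit is (score, strand, start, box). For the "-" strand, start is
    an index into the reverse-complemented sequence.
    """
    width = len(rvds)
    hits = []
    for strand, seq in (("+", sequence), ("-", reverse_complement(sequence))):
        for start in range(len(seq) - width + 1):
            box = seq[start:start + width]
            hits.append((score_box(rvds, box), strand, start, box))
    hits.sort(key=lambda hit: hit[0], reverse=True)
    return hits


if __name__ == "__main__":
    # Toy RVD sequence and toy DNA; both are made up for the example.
    for hit in scan(["HD", "HD", "NG", "NI"], "AACCTATT")[:3]:
        print(hit)
```

Running the same scan over a whole genome instead of promoter sequences only would correspond, in spirit, to the genome-wide strategy proposed in the author summary.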
Database: OpenAIRE