Additional file 2 of Comparative analysis of morabine grasshopper genomes reveals highly abundant transposable elements and rapidly proliferating satellite DNA repeats

Authors: Palacios-Gimenez, Octavio M., Koelman, Julia, Palmada-Flores, Marc, Bradford, Tessa M., Jones, Karl K., Cooper, Steven J. B., Kawakami, Takeshi, Suh, Alexander
Year of publication: 2020
DOI: 10.6084/m9.figshare.13474236.v1
Description: Additional file 2:
Figure S1. Comparison of the k-mer distributions (18-mer frequency) in the genomic reads of four chromosomal races of the viatica species group. a) P24X0. b) P24XY. c) P45bX0. d) P45bXY. The numbers of k-mers (18-mer frequencies) are plotted against the 18-mer coverage. The peak coverage (k-mer cov), the proportion of heterozygous k-mers (heterozygosity), and the repetitive content are indicated in the plot legends.
Figure S2. Comparison of transposable element landscapes in male genome assemblies of four chromosomal races of the viatica species group. a-h) Percentage of bp occupied in the genome (y axis) plotted against the Kimura 2-parameter (transitions/transversions) distance (x axis) of copies from each TE superfamily (color-coded) from their consensus sequences. a,c,e,g) Based on de novo predicted repeats from RepeatModeler (RML) and Arthropoda Repbase library (ARL) repeats. b,d,f,h) Based on curated de novo predicted repeats (curation of the P24X0 library used for re-classification of the three other de novo libraries) from RepeatModeler + ARL repeats. Note the share of unknown (gray) repeats, a majority of which were identified as LTR retrotransposons (green) and DNA transposons (orange/red) when manually curated.
Figure S3. Tandem repeat (TR) landscapes in the male sequenced reads of four chromosomal races of the viatica species group. a) P24X0. b) P24XY. c) P45bX0. d) P45bXY. Temporal accumulation of TRs is shown as repeat element divergence in Kimura 2-parameter (K2P) distance to consensus on the x axis and TR abundance on the y axis. The satDNAs are named as “Vv” (for Vandiemenella viatica group) followed by “P” (for provisional taxon) and a number that indicates the family number in decreasing order of the genomic read proportion of the race.
Figure S4. All-against-all dotplot comparisons showing the diversity of satDNA arrays detected among four chromosomal races of the viatica species group. a) The VvP24X0-6 (28 bp) satDNA in the P24X0 race. b) The VvP24XY-6 (167 bp) satDNA in the P24XY race. c) The VvP45bX0-50 (171 bp) satDNA in the P45bX0 race. d) The VvP45bXY-11 (52 bp) satDNA in the P45bXY race. The satDNAs are named as “Vv” (for Vandiemenella viatica group) followed by “P” (for provisional taxon) and a number that indicates the family number in decreasing order of the genomic read proportion of the race. Each satDNA family was defined by graph-based clustering of sequenced reads in RepeatExplorer2 (see Materials and Methods section). For simplicity, each plot shows all-against-all comparisons of the first six contigs (C1-C6) within the cluster. Contigs are aligned against themselves, against one another, and against their monomer consensus sequence (Cons). The gray/black shades make long shared subsequences between contigs identifiable at a glance, based on the longest common subsequence, or the longest match if mismatches are considered. Longer matches are represented by darker background shading.
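The Kimura 2-parameter (K2P) distance on the x axes of Figures S2 and S3 can be illustrated with a minimal sketch. It assumes two pre-aligned sequences of equal length and applies the plain K2P formula, d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)), where P and Q are the proportions of transitions and transversions. The function name `k2p_distance` is hypothetical, and this is not the authors' pipeline; repeat-landscape tools may apply further corrections that are not modeled here.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Plain Kimura 2-parameter distance between two aligned sequences.

    Hypothetical helper for illustration only. Columns containing gaps or
    ambiguous bases (anything other than A, C, G, T) are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine

    if compared == 0:
        raise ValueError("no comparable alignment columns")

    p = transitions / compared    # proportion of transitions
    q = transversions / compared  # proportion of transversions
    term1 = 1 - 2 * p - q
    term2 = 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term1 * math.sqrt(term2))
```

In a landscape plot, a distance of this kind would be computed for each repeat copy against its family consensus and the copies binned by distance, so that low-divergence bins correspond to recently amplified copies.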
Database: OpenAIRE