Leveraging histone modifications to improve genome annotations.

Authors: Mendieta JP; Department of Genetics, University of Georgia, Athens, GA 30602, USA., Marand AP; Department of Genetics, University of Georgia, Athens, GA 30602, USA., Ricci WA; Department of Plant Biology, University of Georgia, Athens, GA 30602, USA., Zhang X; Department of Genetics, University of Georgia, Athens, GA 30602, USA., Schmitz RJ; Department of Genetics, University of Georgia, Athens, GA 30602, USA.
Language: English
Source: G3 (Bethesda, Md.) [G3 (Bethesda)] 2021 Sep 27; Vol. 11 (10).
DOI: 10.1093/g3journal/jkab263
Abstract: Accurate genome annotations are essential to modern biology; however, they remain challenging to produce. Variation in gene structure and expression across species, as well as within an organism, makes correctly annotating genes arduous, an issue exacerbated by pitfalls in current in silico methods. These issues necessitate complementary approaches to add confidence and rectify potential misannotations. Integration of epigenomic data into genome annotation is one such approach. In this study, we utilized sets of histone modification data, which are precisely distributed at either gene bodies or promoters, to evaluate the annotation of the Zea mays genome. We leveraged these data genome-wide, allowing for identification of annotations discordant with empirical data. In total, 13,159 annotation discrepancies were found in Z. mays upon integrating data across three different tissues, which were corroborated using RNA-based approaches. Upon correction, genes were extended by an average of 2128 base pairs, and we identified 2529 novel genes. Application of this method to five additional plant genomes identified a series of misannotations, as well as novel genes, including 13,836 in Asparagus officinalis, 2724 in Setaria viridis, 2446 in Sorghum bicolor, 8631 in Glycine max, and 2585 in Phaseolus vulgaris. This study demonstrates that histone modification data can be leveraged to rapidly improve current genome annotations across diverse plant lineages.
(© The Author(s) 2021. Published by Oxford University Press on behalf of Genetics Society of America.)
Database: MEDLINE
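
Illustrative note: the abstract describes comparing histone modification domains with annotated gene models to flag discordant annotations. The Python sketch below shows one way such a comparison could look in principle. It is not the authors' pipeline: the `Interval` type, the `classify_domains` function, the `min_extra` threshold, and all coordinates are hypothetical and chosen only for illustration. It assumes gene-body mark domains have already been called as genomic intervals.

```python
from typing import List, NamedTuple, Tuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two intervals share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def classify_domains(
    genes: List[Interval],
    domains: List[Interval],
    min_extra: int = 500,  # hypothetical threshold, in base pairs
) -> List[Tuple[Interval, str, int]]:
    """Label each histone-mark domain relative to the annotated genes.

    - 'candidate_novel_gene': no annotated gene overlaps the domain
    - 'candidate_extension': the domain extends more than min_extra bp
      beyond the span of the genes it overlaps
    - 'concordant': otherwise
    The third tuple element is the number of domain bases lying outside
    the span of the overlapping genes (0 for novel candidates).
    """
    results = []
    for d in domains:
        hits = [g for g in genes if overlaps(g, d)]
        if not hits:
            results.append((d, "candidate_novel_gene", 0))
            continue
        lo = min(g.start for g in hits)
        hi = max(g.end for g in hits)
        extra = max(0, lo - d.start) + max(0, d.end - hi)
        label = "candidate_extension" if extra > min_extra else "concordant"
        results.append((d, label, extra))
    return results


# Toy data (made-up coordinates, not from the study)
genes = [Interval("chr1", 1000, 3000), Interval("chr1", 8000, 9000)]
domains = [
    Interval("chr1", 900, 5000),   # runs well past the annotated gene end
    Interval("chr1", 8100, 8900),  # sits inside an annotated gene
    Interval("chr2", 100, 2000),   # no annotated gene nearby
]

for domain, label, extra in classify_domains(genes, domains):
    print(domain, label, extra)
```

A real analysis would presumably also need strand information, promoter-associated marks, and corroborating RNA evidence, as the abstract indicates; this sketch covers only the interval comparison step.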