Author: |
Turudić, Ante, Liber, Zlatko, Grdiša, Martina, Jakše, Jernej, Varga, Filip, Šatović, Zlatko |
Contributors: |
Goreta Ban, Smiljana, Šatović, Zlatko |
Language: |
English |
Year of publication: |
2022 |
Subject: |
|
Description: |
The number of DNA sequences available in public genetic databases is constantly increasing. In plants, this is particularly evident in the number of complete chloroplast genomes, which are widely used in phylogenetic research. Chloroplast genomes are circular, and most have a four-part structure caused by two copies of a large inverted repeat (IR). We investigated inconsistencies in how publicly available chloroplast genome sequences account for this structure. Our results show that the storage of chloroplast genome sequences is not standardized with respect to the inverted repeat structure, as sequences are stored in different orders. Furthermore, many publicly available sequences lack annotated inverted repeats, although these repeats are expected. In reviewing specialized chloroplast annotation tools, we found no uniform method for identifying inverted repeats: each tool analyzed takes a different approach and covers different specific situations. These results show a need to standardize storage formats for specific data types such as chloroplast sequences. They also suggest that existing public chloroplast data should be revised with respect to standard storage format and missing data. In addition, specialized chloroplast annotation tools need improvement in the detection of inverted repeats. |
Database: |
OpenAIRE |
External link: |
|