GnpAnnot community annotation system: features, qualifiers, values

Authors: Sidibe-Bocs, Stéphanie, Legeai, Fabrice, Droc, Gaëtan, Rouard, Mathieu, Alaux, Michael, Leroy, P., Fournier, Philippe, Terrier, Nancy, Baurens, Franc-Christophe, Garsmeur, Olivier, Poiron, Claire, Guignon, V., Simon, A., Hoede, Claire, Steinbach, Delphine, Lebrun, Marc-Henri, Tagu, Denis, Quesneville, Hadi, Amselem, Joelle
Contributors: Développement et amélioration des plantes (UMR DAP), Centre de Coopération Internationale en Recherche Agronomique pour le Développement (Cirad)-Institut National de la Recherche Agronomique (INRA)-Université Montpellier 2 - Sciences et Techniques (UM2)-Centre National de la Recherche Scientifique (CNRS), Biologie des organismes et des populations appliquées à la protection des plantes (BIO3P), Institut National de la Recherche Agronomique (INRA)-Université de Rennes (UR)-AGROCAMPUS OUEST, Bioversity International [Montpellier], Bioversity International [Rome], Consultative Group on International Agricultural Research [CGIAR] (CGIAR)-Consultative Group on International Agricultural Research [CGIAR] (CGIAR), Unité de Recherche Génomique Info (URGI), Institut National de la Recherche Agronomique (INRA), Biologie Intégrative et Virologie des Insectes [Univ. de Montpellier II] (BIVI), Institut National de la Recherche Agronomique (INRA)-Université Montpellier 2 - Sciences et Techniques (UM2), Sciences Pour l'Oenologie (SPO), Université Montpellier 1 (UM1)-Institut National de la Recherche Agronomique (INRA)-Université de Montpellier (UM)-Institut national d’études supérieures agronomiques de Montpellier (Montpellier SupAgro), Centre de Coopération Internationale en Recherche Agronomique pour le Développement (Cirad), BIOlogie et GEstion des Risques en agriculture (BIOGER), Institut National de la Recherche Agronomique (INRA)-AgroParisTech, Institut National de la Recherche Agronomique (INRA)-Université de Rennes 1 (UR1), Université de Rennes (UNIV-RENNES)-Université de Rennes (UNIV-RENNES)-AGROCAMPUS OUEST, Institut national d'enseignement supérieur pour l'agriculture, l'alimentation et l'environnement (Institut Agro)-Institut national d'enseignement supérieur pour l'agriculture, l'alimentation et l'environnement (Institut Agro), Institut national d’études supérieures agronomiques de Montpellier (Montpellier SupAgro), Institut national d'enseignement supérieur pour l'agriculture, l'alimentation et l'environnement (Institut Agro)-Institut national d'enseignement supérieur pour l'agriculture, l'alimentation et l'environnement (Institut Agro)-Université Montpellier 1 (UM1)-Université de Montpellier (UM)-Institut National de la Recherche Agronomique (INRA), AgroParisTech-Institut National de la Recherche Agronomique (INRA), ProdInra, Migration
Language: English
Publication year: 2009
Subject:
Source: 3. International Biocuration Conference
3. International Biocuration Conference, Apr 2009, Berlin, Germany
3rd International Biocuration Conference, 16-19 March 2009, Berlin, Germany
Description: Correspondence: stephanie.sidibe-bocs@cirad.fr; International audience; As of January 2009, 991 complete genomes have been published and 3376 genome sequencing projects are ongoing, leading to an explosion of data that needs to be stored, curated and analyzed. GnpAnnot is a green genomics project that aims to develop a system for structural and functional annotation, supported by comparative genomics and dedicated to plant and bio-aggressor genomes, allowing both automatic prediction and manual curation of genomic objects. The core of GnpAnnot is a community annotation system (CAS) based on GMOD components: Chado / GBrowse / Apollo / Artemis. The system should also make it possible to browse comparative genomics results, to build queries, and to export gene lists and gene reports in various formats. It should support annotation reconciliation, history, integrity, consistency and updates, as well as the management of public and private projects. To facilitate the work of the curators, four steps are crucial: 1. To provide homogeneous features, qualifiers and values for genomic objects; 2. To share a robust CAS: run high-quality combiners / pipelines that automatically predict genomic objects, which are stored in a relational database management system and then made available through fast graphical and textual browsers and powerful editors; 3. To define annotation rules, train the annotators and organize annotation jamborees; 4. To submit the results to public sequence knowledge bases in an easy way. In this work we focus on the first and third steps. We describe a mapping between different known sources: the Sequence Ontology, the DDBJ / EMBL / GenBank feature definitions, GFF3, Chado, gene nomenclatures, transposable element classification, and annotation guidelines from various genome project consortia. We propose homogeneous feature keys, qualifiers and value formats for genes and transposable elements, using controlled vocabularies wherever possible. We also propose rules for annotating, in a coherent way, the structure and function of genes and the structure and classification of transposable elements. These rules could be useful for both automatic prediction and manual curation. Examples of annotations on a BAC sequence of a monocot are presented.
Database: OpenAIRE