Phenoscape: Semantic analysis of organismal traits and genes yields insights in evolutionary biology
Author: | Hilmar Lapp, Wasila M. Dahdul, Paula M. Mabee, Monte Westerfield, Josef C. Uyeda, Prashanti Manda, James P. Balhoff, Todd Vision |
---|---|
Contributors: | Biological Sciences |
Language: | English |
Publication year: | 2018 |
Subject: |
0106 biological sciences; Cognitive science; 0303 health sciences; Government; Modern evolutionary synthesis; Interoperability; 010603 evolutionary biology; 01 natural sciences; 03 medical and health sciences; ComputingMethodologies_PATTERNRECOGNITION; Semantic similarity; Phenotype ontology; Coordination network; Semantic analysis (knowledge representation); Machine reasoning; Sociology; 030304 developmental biology |
Source: | Application of Semantic Technology in Biodiversity Science |
DOI: | 10.7287/peerj.preprints.26988v1 |
Description: | The study of how the observable features of organisms, i.e., their phenotypes, result from the complex interplay between genetics, development, and the environment, is central to much research in biology. The varied language used in the description of phenotypes, however, impedes the large scale and interdisciplinary analysis of phenotypes by computational methods. The Phenoscape project (www.phenoscape.org) has developed semantic annotation tools and a gene–phenotype knowledgebase, the Phenoscape KB, that uses machine reasoning to connect evolutionary phenotypes from the comparative literature to mutant phenotypes from model organisms. The semantically annotated data enables the linking of novel species phenotypes with candidate genes that may underlie them. Semantic annotation of evolutionary phenotypes further enables previously difficult or novel analyses of comparative anatomy and evolution. These include generating large, synthetic character matrices of presence/absence phenotypes based on inference, and searching for taxa and genes with similar variation profiles using semantic similarity. Phenoscape is further extending these tools to enable users to automatically generate synthetic supermatrices for diverse character types, and use the domain knowledge encoded in ontologies for evolutionary trait analysis. Curating the annotated phenotypes necessary for this research requires significant human curator effort, although semi-automated natural language processing tools promise to expedite the curation of free text. As semantic tools and methods are developed for the biodiversity sciences, new insights from the increasingly connected stores of interoperable phenotypic and genetic data are anticipated. We thank all participants in the Phenotype Ontology Research Coordination Network (RCN) (NSF 0956049) for their vision, contributions, and commitment to developing shared and interoperable resources.
During the course of this work the Phenoscape project has been supported by NSF awards 1062404, 1062542, 0641025, 1661529, and the National Evolutionary Synthesis Center (NSF 0905606 and 0423641). This manuscript is based in part on work done by P.M.M. while serving at the U.S. National Science Foundation. The views expressed in this paper do not necessarily reflect those of the National Science Foundation or the United States Government. |
Database: | OpenAIRE |
External link: |