Annotation and query of tissue microarray data using the NCI Thesaurus
Author: | Nigam H. Shah, Daniel L. Rubin, Inigo Espinosa, Kelli Montgomery, Mark A. Musen |
---|---|
Language: | English |
Year of publication: | 2007 |
Subject: |
Databases, Factual; Computer science; Gene Expression Array; Information Storage and Retrieval; Text annotation; Sample (statistics); Documentation; Ontology (information science); lcsh:Computer applications to medicine. Medical informatics; Biochemistry; 03 medical and health sciences; Annotation; 0302 clinical medicine; Structural Biology; Neoplasms; lcsh:QH301-705.5; Molecular Biology; 030304 developmental biology; 0303 health sciences; Biological data; Information retrieval; Tissue microarray; National Library of Medicine (U.S.); Applied Mathematics; Gene Expression Profiling; NCI Thesaurus; United States; Computer Science Applications; Neoplasm Proteins; lcsh:Biology (General); Vocabulary, Controlled; Tissue Array Analysis; 030220 oncology & carcinogenesis; lcsh:R858-859.7; Database Management Systems; DNA microarray; Software |
Source: | BMC Bioinformatics, Vol 8, Iss 1, p 296 (2007) |
ISSN: | 1471-2105 |
Description: | Background: The Stanford Tissue Microarray Database (TMAD) is a repository of data serving a consortium of pathologists and biomedical researchers. The tissue samples in TMAD are annotated with multiple free-text fields that specify the pathological diagnoses for each sample. These text annotations are not structured according to any ontology, which makes future integration of this resource with other biological and clinical data difficult. Results: We developed methods to map these annotations to the NCI Thesaurus (NCI-T). Using the NCI-T, we can effectively represent annotations for about 86% of the samples. We demonstrate how this mapping enables ontology-driven integration and querying of tissue microarray data. We have deployed the mapping and ontology-driven querying tools at the TMAD site for general use. Conclusion: We have demonstrated that we can effectively map the diagnosis-related terms describing a sample in TMAD to the NCI-T. The NCI-T terms have wide coverage and provide terms for about 86% of the samples. In our opinion, the NCI-T can facilitate integration of this resource with other biological data. |
Database: | OpenAIRE |