Annotation and query of tissue microarray data using the NCI Thesaurus

Authors: Nigam H. Shah, Daniel L. Rubin, Inigo Espinosa, Kelli Montgomery, Mark A. Musen
Language: English
Year of publication: 2007
Subject:
Databases, Factual
Computer science
Gene Expression Array
Information Storage and Retrieval
Text annotation
Sample (statistics)
Documentation
Ontology (information science)
lcsh:Computer applications to medicine. Medical informatics
Biochemistry
03 medical and health sciences
Annotation
0302 clinical medicine
Structural Biology
Neoplasms
lcsh:QH301-705.5
Molecular Biology
030304 developmental biology
0303 health sciences
Biological data
Information retrieval
Tissue microarray
National Library of Medicine (U.S.)
Applied Mathematics
Gene Expression Profiling
NCI Thesaurus
United States
Computer Science Applications
Neoplasm Proteins
lcsh:Biology (General)
Vocabulary, Controlled
Tissue Array Analysis
030220 oncology & carcinogenesis
lcsh:R858-859.7
Database Management Systems
DNA microarray
Software
Source: BMC Bioinformatics
BMC Bioinformatics, Vol 8, Iss 1, p 296 (2007)
ISSN: 1471-2105
Description: Background: The Stanford Tissue Microarray Database (TMAD) is a repository of data serving a consortium of pathologists and biomedical researchers. The tissue samples in TMAD are annotated with multiple free-text fields that specify the pathological diagnoses for each sample. These text annotations are not structured according to any ontology, which makes future integration of this resource with other biological and clinical data difficult. Results: We developed methods to map these annotations to the NCI Thesaurus (NCI-T). Using the NCI-T, we can effectively represent annotations for about 86% of the samples. We demonstrate how this mapping enables ontology-driven integration and querying of tissue microarray data. We have deployed the mapping and ontology-driven querying tools at the TMAD site for general use. Conclusion: We have demonstrated that we can effectively map the diagnosis-related terms describing a sample in TMAD to the NCI-T. The NCI Thesaurus has wide coverage and provides terms for about 86% of the samples. In our opinion, the NCI Thesaurus can facilitate integration of this resource with other biological data.
Database: OpenAIRE