Description: |
With the advent of biological ontologies, a growing number of methods are emerging for enriching gene information by means of gene annotations. However, problems arise when assessing semantic similarity over genetic aspects that are represented independently in different schemas although, in reality, they are not independent. This paper presents a framework that integrates heterogeneous knowledge from different resources (i.e., ontologies, texts, and expert classifications) to capture information about how a set of genes works together in targeting a biological process. Our approach is grounded in the ontological annotation of gene summaries. Given the analogy between these annotations and the representation of documents in information retrieval, we apply text mining techniques to evaluate the semantic similarity of the summaries within a gene set. To determine whether our framework is meaningful in a biological context, we conducted experiments on popular gene sets and compared the results with the assertions of domain experts. Our approach provides an empirical basis for capturing complementary information about how genes interact, and it could be used in conjunction with other similarity methods or bioinformatics tools. |
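
The analogy between ontological annotations and document representations can be illustrated with a minimal sketch. It is not the framework described in the abstract: it assumes each gene summary has already been reduced to a list of ontology term identifiers, and it swaps in a standard TF-IDF weighting with cosine similarity as one possible text mining technique. The gene names, term identifiers, function names, and the smoothed IDF formula are all hypothetical choices made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def tfidf_vectors(annotations):
    """Build one sparse TF-IDF vector per gene.

    annotations: dict mapping a gene name to the list of ontology term
    identifiers found in its summary (repeats allowed).
    """
    n_genes = len(annotations)
    # Document frequency: in how many gene summaries each term occurs.
    doc_freq = Counter()
    for terms in annotations.values():
        doc_freq.update(set(terms))

    vectors = {}
    for gene, terms in annotations.items():
        term_freq = Counter(terms)
        # Smoothed IDF (an assumption of this sketch) keeps weights positive.
        vectors[gene] = {
            term: count * (math.log((1 + n_genes) / (1 + doc_freq[term])) + 1.0)
            for term, count in term_freq.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v[term] for term, weight in u.items() if term in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def mean_pairwise_similarity(annotations):
    """Average cosine similarity over all gene pairs in the set."""
    vectors = tfidf_vectors(annotations)
    pairs = list(combinations(vectors, 2))
    if not pairs:
        return 0.0
    return sum(cosine(vectors[a], vectors[b]) for a, b in pairs) / len(pairs)


# Hypothetical gene set: names and term identifiers are placeholders.
example_annotations = {
    "GENE_A": ["TERM:1", "TERM:2", "TERM:2"],
    "GENE_B": ["TERM:1", "TERM:2"],
    "GENE_C": ["TERM:3"],
}
example_score = mean_pairwise_similarity(example_annotations)
```

Under these assumptions, a higher `example_score` would suggest that the summaries in the gene set share more of their informative ontology terms. Any such score would still need to be checked against expert classifications, as the abstract describes.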