Author: |
Pan Y (Institute for Information Technology, National Research Council Canada, 1200 Montreal Road, Bldg M-50, Ottawa, Ontario K1A 0R6, Canada; Youlian.Pan@nrc.ca), Pylatuik JD, Ouyang J, Famili AF, Fobert PR |
Language: |
English |
Source: |
Journal of bioinformatics and computational biology [J Bioinform Comput Biol] 2004 Dec; Vol. 2 (4), pp. 639-55. |
DOI: |
10.1142/s0219720004000776 |
Abstract: |
Various data mining techniques, combined with sequence motif information from gene promoter regions, were applied to discover functional genes involved in the defense mechanism of systemic acquired resistance (SAR) in Arabidopsis thaliana. A series of K-means clustering runs, with difference-in-shape as the distance measure, was applied first, and a stability measure was used to validate the clustering. A decision tree algorithm with the discover-and-mask technique was then used to identify a group of the most informative genes. The occurrence and abundance of various transcription factor binding sites in the promoter regions of the genes were studied. By combining these techniques, we identified 24 candidate genes involved in the SAR defense mechanism. The candidate genes fell into two highly resolved categories, each showing a significantly distinct profile of regulatory elements in its promoter regions. This study demonstrates the strength of such integrated methods and suggests that the approach can be applied more broadly. |
Database: |
MEDLINE |