Including transcription factor information in the superparamagnetic clustering of microarray data

Author: Monsivais-Alonso, M. P., Navarro-Munoz, J. C., Riego-Ruiz, L., Lopez-Sandoval, R., Rosu, H. C.
Year of publication: 2010
Subject:
Source: Physica A 389(24), 5689-5697 (2010)
Document type: Working Paper
DOI: 10.1016/j.physa.2010.09.006
Description: In this work, we modify the superparamagnetic clustering algorithm (SPC) by adding an extra weight to the interaction formula that accounts for genes regulated by the same transcription factor. With this modified algorithm, which we call SPCTF, we analyze the Spellman et al. microarray data for cell cycle genes in yeast and find clusters with a higher number of elements than those obtained with the SPC algorithm. Some of the genes incorporated by SPCTF were not detected at first by Spellman et al. but were later identified by other studies, whereas several genes still remain unclassified. The clusters composed of unidentified genes were analyzed with MUSA, the motif finding using an unsupervised approach algorithm, which allowed us to select the clusters whose elements contain cell cycle transcription factor binding sites as clusters worthy of further experimental study, because they would probably lead to new cell cycle genes. Finally, our idea of introducing available information about transcription factors to optimize the gene classification could be implemented for other distance-based clustering algorithms.
Comment: 16 pages, 6 figures (composed of 11 figure files in total), 2 tables; a supplementary information file listing the 27 most stable clusters of 6 or more cell cycle genes found with this approach, with the codes of their component genes in synchronized S. cerevisiae yeast cultures, is available from the authors or from the Physica A site
Database: arXiv
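
Illustration: the description above mentions an extra weight in the SPC interaction formula for genes regulated by the same transcription factor, but this record does not give the formula itself. The following is a minimal sketch, not the authors' implementation. It assumes the commonly cited SPC coupling J_ij = exp(-d_ij^2 / (2 a^2)) / K for neighboring genes, where a is the mean neighbor distance and K the mean number of neighbors, and it assumes a purely hypothetical multiplicative bonus (1 + tf_weight) for pairs sharing a transcription factor. The function name, its parameters, and the value of tf_weight are all illustrative assumptions.

```python
import math


def spc_couplings(dist, neighbors, shared_tf, tf_weight=0.5):
    """Return {(i, j): J_ij} for every neighbor pair (i, j) with i < j.

    dist       : square matrix (list of lists) of distances between
                 gene expression profiles
    neighbors  : set of (i, j) pairs, i < j, treated as interacting
    shared_tf  : set of (i, j) pairs regulated by a common transcription
                 factor
    tf_weight  : HYPOTHETICAL extra weight; the form and value used in
                 the paper are not stated in this record
    """
    n = len(dist)
    # Mean number of neighbors per gene (each pair counts for both genes).
    k_hat = 2 * len(neighbors) / n
    # Characteristic length scale: mean distance over neighbor pairs.
    a = sum(dist[i][j] for i, j in neighbors) / len(neighbors)

    couplings = {}
    for i, j in neighbors:
        # Assumed standard SPC coupling: decays with squared distance.
        base = math.exp(-dist[i][j] ** 2 / (2 * a ** 2)) / k_hat
        # Assumed transcription-factor bonus: strengthens the coupling
        # when both genes share a regulator.
        bonus = 1.0 + tf_weight if (i, j) in shared_tf else 1.0
        couplings[(i, j)] = base * bonus
    return couplings
```

With a stronger coupling, co-regulated genes are more likely to stay in the same cluster, which is consistent with the larger clusters reported in the description. The same kind of pairwise reweighting could in principle be applied to the distance or similarity matrix of other distance-based clustering methods.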