Accounting for noise when clustering biological data
Author: | Roman Sloutsky, Nicolas D. Jimenez, Kristen M. Naegle, S. Joshua Swamidass |
---|---|
Year of publication: | 2012 |
Subject: |
Biological data, Computer science, Gene Expression, Proteins, Accounting, Expression (mathematics), Data set, Range (mathematics), Noise, Unsupervised learning, Sensitivity (control systems), Data mining, Phosphorylation, Cluster analysis, Transcriptome, Molecular Biology, Algorithms, Information Systems |
Source: | Briefings in Bioinformatics. 14(4) |
ISSN: | 1477-4054 |
Description: | Clustering is a powerful and commonly used technique that organizes and elucidates the structure of biological data. Clustering data from gene expression, metabolomics and proteomics experiments has proven useful for deriving a variety of insights, such as the shared regulation or function of biochemical components within networks. However, experimental measurements of biological processes are subject to substantial noise, stemming from both technical and biological variability, and most clustering algorithms are sensitive to this noise. In this article, we explore several methods of accounting for noise when analyzing biological data sets through clustering. Using a toy data set and two different case studies (gene expression and protein phosphorylation), we demonstrate the sensitivity of clustering algorithms to noise. Several methods of accounting for this noise can be used to establish when clustering results can be trusted. These methods span a range of assumptions about the statistical properties of the noise and can therefore be applied to virtually any biological data source. |
Database: | OpenAIRE |