Some statistical properties of gene expression clustering for array data

Author:	Abreu, G C G, Pinheiro, A, Drummond, R D, Camargo, S R, Menossi, M
Language:	English
Year of publication:	2010
Subject:	genomics, bootstrap resampling, array data
Source:	Abreu, G C G, Pinheiro, A, Drummond, R D, Camargo, S R & Menossi, M 2010, 'Some statistical properties of gene expression clustering for array data', Advances and Applications in Statistics, vol. 14, no. 2, pp. 191-204.
Description:	DNA arrays have been a rich source of data for the study of genomic expression in a wide variety of biological systems. Gene clustering is a widely used paradigm for assessing the significance of a gene (or group of genes). However, most gene clustering techniques are applied to cDNA array data without a corresponding statistical error measure. We propose an easy-to-implement and simple-to-use technique that uses bootstrap resampling to evaluate the statistical error of the nodes provided by SOM-based clustering. Comparisons between SOM and parametric clustering are presented for simulated data as well as for two real data sets. We also implement a bootstrap-based pre-processing procedure for SOM that improves the false discovery ratio of differentially expressed genes. Code in Matlab is freely available, along with some supplementary material, at the following address: https://ipe.cbmeg.unicamp.br/pub/abreu.gcg. Code implementation in R is in progress. Publication date: February
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=pure_au_____::b9b547112fa646c93beb65f9089fd8e2 https://pure.au.dk/portal/da/publications/some-statistical-properties-of-gene-expression-clustering-for-array-data(4f88ba30-bbf0-11df-80f4-000ea68e967b).html