Issues of Processing and Multiple Testing of SELDI-TOF MS Proteomic Data

Author:	Mark J. van der Laan, Merrill D. Birkner, Christine M. Hegedus, Christine F. Skibola, Alan Hubbard, Martyn T. Smith
Year of publication:	2006
Subject:	Proteomics; Statistics and Probability; Childhood leukemia; Bone Marrow Cells; Computational biology; Text mining; Statistics; Genetics; False positive paradox; medicine; Humans; Child; Molecular Biology; Probability; Mathematics; business.industry; Confounding; Myeloid leukemia; Precursor Cell Lymphoblastic Leukemia-Lymphoma; medicine.disease; Regression; Neoplasm Proteins; Computational Mathematics; Leukemia; Leukemia, Myeloid; Data Interpretation, Statistical; Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization; Acute Disease; Multiple comparisons problem; business; Algorithms
Source:	Statistical Applications in Genetics and Molecular Biology. 5
ISSN:	1544-6115
DOI:	10.2202/1544-6115.1198
Description:	A new data filtering method for SELDI-TOF MS proteomic spectra data is described. We examined technical repeats (2 per subject) of intensity versus m/z (mass/charge) of bone marrow cell lysate for two groups of childhood leukemia patients: acute myeloid leukemia (AML) and acute lymphoblastic leukemia (ALL). As others have noted, the type of data processing as well as experimental variability can have a disproportionate impact on the list of "interesting" proteins (see Baggerly et al. (2004)). We propose a sequence of processing and multiple testing techniques: 1) correction for background drift; 2) filtering using smooth regression with cross-validated bandwidth selection; 3) peak finding; and 4) methods to correct for multiple testing (van der Laan et al. (2005)). The result is a list of proteins (indexed by m/z) whose average expression differs significantly among disease (or treatment, etc.) groups. The procedures are intended to provide a sensible and statistically driven algorithm, which we argue yields a list of proteins that have a significant difference in expression. Given no sources of unmeasured bias (such as confounding of experimental conditions with disease status), proteins found to be statistically significant using this technique have a low probability of being false positives.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::c63988afe2468a0d52eec766e333d194 https://doi.org/10.2202/1544-6115.1198
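
Illustrative note on step 2 of the description (smooth regression with cross-validated bandwidth selection). The record does not state which smoother, kernel, or cross-validation scheme the authors used, so the following is only a minimal sketch under stated assumptions: a Gaussian-kernel Nadaraya-Watson smoother, leave-one-out cross-validation, and a synthetic single-peak "spectrum". The function names `nw_smooth` and `cv_bandwidth`, the candidate bandwidths, and the simulated data are all hypothetical and are not taken from the paper.

```python
import numpy as np

def nw_smooth(x, y, h, leave_one_out=False):
    """Nadaraya-Watson fit at every x[i] (Gaussian kernel, bandwidth h)."""
    d = (x[:, None] - x[None, :]) / h
    w = np.exp(-0.5 * d ** 2)
    if leave_one_out:
        # Zero each point's own weight so it is predicted from its neighbours only.
        np.fill_diagonal(w, 0.0)
    return (w @ y) / w.sum(axis=1)

def cv_bandwidth(x, y, candidates):
    """Return the candidate bandwidth with the lowest leave-one-out squared error."""
    scores = [
        float(np.mean((y - nw_smooth(x, y, h, leave_one_out=True)) ** 2))
        for h in candidates
    ]
    return candidates[int(np.argmin(scores))], scores

# Synthetic stand-in for one spectrum (assumption): a single peak plus noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 200)                      # rescaled m/z axis
truth = np.exp(-0.5 * ((x - 0.5) / 0.05) ** 2)      # noise-free intensity
y = truth + rng.normal(scale=0.2, size=x.size)      # observed intensity

candidates = [0.005, 0.01, 0.02, 0.05, 0.1, 0.3]    # hypothetical bandwidth grid
best_h, scores = cv_bandwidth(x, y, candidates)
smoothed = nw_smooth(x, y, best_h)

mse_raw = float(np.mean((y - truth) ** 2))
mse_smooth = float(np.mean((smoothed - truth) ** 2))
print(f"selected bandwidth: {best_h}")
print(f"error vs. noise-free signal: raw {mse_raw:.4f}, smoothed {mse_smooth:.4f}")
```

Leave-one-out scoring is used here because scoring a smoother on the same points it was fitted to would always favour the smallest bandwidth; holding each point out penalises both under-smoothing (noise is reproduced) and over-smoothing (the peak is flattened).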