Reference point insensitive molecular data analysis

Authors: Michael Altenbuchinger, Daniela Weber, Frank Stämmler, Katja Dettmer, Thorsten Rehberg, Peter J. Oefner, Rainer Spang, Ernst Holler, Andreas Hiergeist, André Gessner, Helena U. Zacharias
Year of publication: 2016
Subject:
0301 basic medicine
Statistics and Probability
Computer science
610 Medicine
01 natural sciences
Biochemistry
Measure (mathematics)
010104 statistics & probability
03 medical and health sciences
Metabolomics
Lasso (statistics)
Statistics
Humans
Computer Simulation
0101 mathematics
Coordinate descent
Molecular Biology
VERSUS-HOST-DISEASE
STEM-CELL TRANSPLANTATION
LOGISTIC-REGRESSION
VARIABLE SELECTION
C-MYC
REGULARIZATION
MICROBIOME
LASSO
Biomedicine
ddc:610
Bacteria
business.industry
Computational Biology
Regression analysis
Observable
Pattern recognition
Gene Expression Regulation, Bacterial
Regression
Computer Science Applications
Gastrointestinal Microbiome
Computational Mathematics
030104 developmental biology
Computational Theory and Mathematics
Artificial intelligence
business
Algorithms
Software
Source: Bioinformatics (Oxford, England). 33(2)
ISSN: 1367-4811
Description: Motivation: In biomedicine, every molecular measurement is relative to a reference point, such as a fixed aliquot of RNA extracted from a tissue, a defined number of blood cells, or a defined volume of biofluid. Reference points are often chosen for practical reasons. For example, we might want to assess the metabolome of a diseased organ but can only measure metabolites in blood or urine. In this case, the observable data reflect the disease state only indirectly. The statistical implications of these discrepancies in reference points have not yet been discussed. Results: Here, we show that reference point discrepancies compromise the performance of regression models such as the LASSO. As an alternative, we suggest zero-sum regression for a reference-point-insensitive analysis. We show that zero-sum regression is superior to the LASSO when the reference point is poorly chosen, both in simulations and in an application that integrates intestinal microbiome analysis with metabolomics. Moreover, we describe a novel coordinate-descent-based algorithm to fit zero-sum elastic nets. Availability and Implementation: The R-package “zeroSum” can be downloaded at https://github.com/rehbergT/zeroSum. Moreover, we provide all R-scripts and data used to produce the results of this manuscript as Supplementary Material. Supplementary information: Supplementary material is available at Bioinformatics online.
Database: OpenAIRE
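
The reference point insensitivity described in the abstract can be illustrated with a minimal sketch. It assumes (this is not stated in the record) that features are log-transformed, so that a change of reference point adds the same constant to every feature of a sample. Under that assumption, a linear predictor whose coefficients sum to zero is unaffected by the shift. The function names and all numbers below are hypothetical and are not taken from the zeroSum package.

```python
import math
import random


def predict(intercept, coefs, log_features):
    """Linear predictor: intercept + sum_j coefs[j] * log_features[j]."""
    return intercept + sum(b * x for b, x in zip(coefs, log_features))


def shift_reference_point(log_features, gamma):
    """Assumed model of a reference point change: on the log scale, every
    feature of one sample is shifted by the same constant gamma."""
    return [x + gamma for x in log_features]


random.seed(0)
log_x = [random.gauss(0.0, 1.0) for _ in range(5)]  # one hypothetical sample

zero_sum_coefs = [0.5, -1.0, 0.25, 0.25, 0.0]       # coefficients sum to 0
unconstrained_coefs = [0.5, -1.0, 0.25, 0.75, 0.0]  # coefficients sum to 0.5

gamma = math.log(2.0)  # e.g. twice as much material was measured
shifted = shift_reference_point(log_x, gamma)

# Change in prediction caused by the shift alone.
zero_sum_change = (predict(1.0, zero_sum_coefs, shifted)
                   - predict(1.0, zero_sum_coefs, log_x))
unconstrained_change = (predict(1.0, unconstrained_coefs, shifted)
                        - predict(1.0, unconstrained_coefs, log_x))

print(f"zero-sum change:      {zero_sum_change:.6f}")
print(f"unconstrained change: {unconstrained_change:.6f}")
```

In this sketch the shift contributes gamma times the sum of the coefficients to the prediction, so it vanishes (up to floating-point rounding) exactly when the coefficients sum to zero, while the unconstrained model's prediction moves with the choice of reference point.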