Efficient estimation of grouped survival models

Authors: Janice M. McCarthy, Tracy Truong, Jiaxing Lin, Yu Jiang, Deanna L. Kroetz, Kouros Owzar, Alexander B. Sibley, Andrew S. Allen, Zhiguo Li, Katherina C. Chua
Year of publication: 2019
Subject:
Computer science
Statistics as Topic
Efficient score
Biochemistry
Mathematical Sciences
0302 clinical medicine
Gene Frequency
Genome-wide analysis
Structural Biology
Multiple testing
lcsh:QH301-705.5
Cancer
Likelihood Functions
0303 health sciences
Applied Mathematics
Grouped data
Biological Sciences
3. Good health
Computer Science Applications
Benchmarking
Phenotype
030220 oncology & carcinogenesis
lcsh:R858-859.7
Data mining
Bioinformatics
lcsh:Computer applications to medicine. Medical informatics
Heritability
03 medical and health sciences
Information and Computing Sciences
Breast Cancer
Covariate
Genetics
Humans
Molecular Biology
Survival analysis
030304 developmental biology
Data collection
Models, Genetic
Human Genome
Score statistic
lcsh:Biology (General)
Discrete censoring
Multiple comparisons problem
Pharmacogenomics
Software
Genome-Wide Association Study
Source: BMC Bioinformatics, Vol 20, Iss 1, Pp 1-11 (2019)
ISSN: 1471-2105
DOI: 10.1186/s12859-019-2899-x
Description:
Background: Time- and dose-to-event phenotypes used in basic science and translational studies are commonly measured imprecisely or incompletely due to limitations of the experimental design or data collection schema. For example, drug-induced toxicities are not reported by the actual time or dose triggering the event, but rather are inferred from the cycle or dose to which the event is attributed. This exemplifies a prevalent type of imprecise measurement called grouped failure time, where times or doses are restricted to discrete increments. Failure to appropriately account for the grouped nature of the data, when present, may lead to biased analyses.
Results: We present groupedSurv, an R package which implements a statistically rigorous and computationally efficient approach for conducting genome-wide analyses based on grouped failure time phenotypes. Our approach accommodates adjustments for baseline covariates, and analysis at the variant or gene level. We illustrate the statistical properties of the approach and the computational performance of the package by simulation. We present the results of a reanalysis of a published genome-wide study to identify common germline variants associated with the risk of taxane-induced peripheral neuropathy in breast cancer patients.
Conclusions: groupedSurv enables fast and rigorous genome-wide analysis on the basis of grouped failure time phenotypes at the variant, gene or pathway level. The package is freely available under a public license through the Comprehensive R Archive Network.
Database: OpenAIRE