Detecting potential outliers in longitudinal data with time-dependent covariates.

Authors: Mramba LK; Health Informatics Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, USA. Lazarus.Mramba@epi.usf.edu., Liu X; Health Informatics Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, USA., Lynch KF; Health Informatics Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, USA., Yang J; Health Informatics Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, USA., Aronsson CA; Department of Clinical Sciences, Lund University, Malmö, Sweden.; Department of Pediatrics, Skåne University Hospital, Malmö, Sweden., Hummel S; Institute of Diabetes Research, Helmholtz Zentrum and Forschergruppe Diabetes, Klinikum rechts der Isar, Technische Universität and Forschergruppe Diabetes e.V, Munich, Germany., Norris JM; Department of Epidemiology, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, CO, USA., Virtanen SM; Finnish Institute for Health and Welfare, Health and Well-Being Promotion Unit, Helsinki, Finland.; Center for Child Health Research, University of Tampere and Tampere University Hospital, Tampere, Finland.; Faculty of Social Sciences, Unit of Health Sciences, Tampere University, Tampere, Finland.; Tampere University Hospital, Wellbeing Services County of Pirkanmaa, Tampere, Finland., Hakola L; Faculty of Social Sciences, Unit of Health Sciences, Tampere University, Tampere, Finland.; Tampere University Hospital, Wellbeing Services County of Pirkanmaa, Tampere, Finland., Uusitalo UM; Health Informatics Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, USA., Krischer JP; Health Informatics Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, USA.
Language: English
Source: European journal of clinical nutrition [Eur J Clin Nutr] 2024 Apr; Vol. 78 (4), pp. 344-350. Date of Electronic Publication: 2024 Jan 03.
DOI: 10.1038/s41430-023-01393-6
Abstract: Background: Outliers can influence regression model parameters and change the direction of the estimated effect, over-estimating or under-estimating the strength of the association between a response variable and an exposure of interest. Identifying visit-level outliers in longitudinal data with continuous time-dependent covariates is important when the distribution of such a variable is highly skewed.
Objectives: The primary objective was to identify potential outliers at follow-up visits using the interquartile range (IQR) statistic and to assess their influence on estimated Cox regression parameters.
Methods: The study was motivated by the large TEDDY dietary longitudinal and time-to-event dataset, with continuous time-varying vitamin B12 intake as the exposure of interest and development of Islet Autoimmunity (IA) as the response variable. An IQR algorithm was applied to the TEDDY dataset to detect potential outliers at each visit. To assess the impact of detected outliers, data were analyzed using the extended time-dependent Cox model with a robust sandwich estimator. Partial residual diagnostic plots were examined for highly influential outliers.
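The visit-level IQR screening described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `flag_visit_outliers`, the `(subject_id, visit, value)` record layout, the conventional Tukey fence multiplier `k = 1.5`, and the inclusive quartile method are all assumptions made for illustration, since the abstract does not specify them.

```python
from collections import defaultdict
from statistics import quantiles


def flag_visit_outliers(records, k=1.5):
    """Flag (subject_id, visit) pairs whose value lies outside the IQR fences.

    records: iterable of (subject_id, visit, value) tuples (hypothetical layout).
    k: fence multiplier; 1.5 is the conventional Tukey choice (an assumption here).
    """
    # Group observations by visit so fences are computed per visit.
    by_visit = defaultdict(list)
    for subject_id, visit, value in records:
        by_visit[visit].append((subject_id, value))

    flagged = set()
    for visit, rows in by_visit.items():
        values = [value for _, value in rows]
        if len(values) < 4:
            continue  # too few observations for meaningful quartiles
        q1, _, q3 = quantiles(values, n=4, method="inclusive")
        iqr = q3 - q1
        lower, upper = q1 - k * iqr, q3 + k * iqr
        for subject_id, value in rows:
            if value < lower or value > upper:
                flagged.add((subject_id, visit))
    return flagged
```

Flagged observations are candidates for further review rather than automatic exclusion, in line with the abstract's framing of the IQR algorithm as a data quality control tool.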
Results: Extreme vitamin B12 observations that were cases of IA had a stronger influence on the Cox regression model than non-cases. Identified outliers changed the direction of the hazard ratios, the standard errors, or the strength of the association with the risk of developing IA.
Conclusion: At the exploratory data analysis stage, the IQR algorithm can be used as a data quality control tool to identify potential outliers at the visit level, which can be further investigated.
(© 2023. The Author(s), under exclusive licence to Springer Nature Limited.)
Database: MEDLINE