What are the implications of using individual and combined sources of routinely collected data to identify and characterise incident site-specific cancers? A concordance and validation study using linked English electronic health records data.

Authors: Strongman H; Department of Non-communicable Disease Epidemiology, London School of Hygiene & Tropical Medicine, London, UK (helen.strongman@lshtm.ac.uk). Williams R; Clinical Practice Research Datalink (CPRD), Medicines and Healthcare Products Regulatory Agency, London, UK. Bhaskaran K; Department of Non-communicable Disease Epidemiology, London School of Hygiene & Tropical Medicine, London, UK.
Language: English
Source: BMJ open [BMJ Open] 2020 Aug 20; Vol. 10 (8), pp. e037719. Date of Electronic Publication: 2020 Aug 20.
DOI: 10.1136/bmjopen-2020-037719
Abstract: Objectives: To describe the benefits and limitations of using individual sources, and combinations of sources, of linked English electronic health data to identify incident cancers.
Design and Setting: Our descriptive study uses linked English Clinical Practice Research Datalink primary care, cancer registration, hospitalisation and death registration data.
Participants and Measures: We implemented case definitions to identify first site-specific cancers at the 20 most common sites, based on the first ever cancer diagnosis recorded in each individual or commonly used combination of data sources between 2000 and 2014. We calculated positive predictive values and sensitivities of each definition, compared with a gold standard algorithm that used information from all linked data sets to identify first cancers. We described completeness of grade and stage information in the cancer registration data set.
Results: 165 953 gold standard cancers were identified. Positive predictive values of all case definitions were ≥80% across all cancer sites and ≥94% for the four most common cancers (breast, lung, colorectal and prostate). Sensitivity for case definitions that used cancer registration alone or in combination was ≥92% for the four most common cancers and ≥80% across all cancer sites except bladder cancer (65% using cancer registration alone). For case definitions using linked primary care, hospitalisation and death registration data, sensitivity was ≥89% for the four most common cancers, and ≥80% for all cancer sites except kidney (69%), oral cavity (76%) and ovarian cancer (78%). When primary care or hospitalisation data were used alone, sensitivities were generally lower and diagnosis dates were delayed. Completeness of staging data in cancer registration data was high from 2012 (minimum 76.0% in 2012 and 86.4% in 2014 for the four most common cancers).
Conclusions: Ascertainment of incident cancers was good when using cancer registration data alone or in combination with other data sets, and for the majority of cancers when using a combination of primary care, hospitalisation and death registration data.
Competing Interests: RW is employed by Clinical Practice Research Datalink. HS and KB have academic honorary contracts at Public Health England for a separate collaborative research study.
(© Author(s) (or their employer(s)) 2020. Re-use permitted under CC BY. Published by BMJ.)
Database: MEDLINE
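
Illustrative note: the positive predictive value and sensitivity reported in the abstract compare each case definition with the gold standard algorithm. The sketch below is not the authors' code; it is a minimal illustration that assumes a match means the same patient and the same cancer site, and it ignores agreement on diagnosis dates. The function name, the data layout and the toy records are all hypothetical.

```python
def ppv_and_sensitivity(case_definition, gold_standard):
    """Return (PPV, sensitivity) of a case definition against a gold standard.

    Both arguments are sets of (patient_id, cancer_site) pairs.
    Assumption: a true positive is a pair present in both sets.
    """
    true_positives = len(case_definition & gold_standard)
    # PPV: share of cancers found by the case definition that the gold standard confirms.
    ppv = true_positives / len(case_definition) if case_definition else float("nan")
    # Sensitivity: share of gold standard cancers that the case definition finds.
    sensitivity = true_positives / len(gold_standard) if gold_standard else float("nan")
    return ppv, sensitivity


# Hypothetical toy data, not taken from the study.
gold = {(1, "breast"), (2, "lung"), (3, "bladder"), (4, "prostate"), (6, "colorectal")}
registry_only = {(1, "breast"), (2, "lung"), (4, "prostate"), (5, "kidney")}

ppv, sens = ppv_and_sensitivity(registry_only, gold)
print(f"PPV={ppv:.2f}, sensitivity={sens:.2f}")
```

In this toy example three of the four cancers found by the case definition are confirmed, and three of the five gold standard cancers are found.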