Quantitative imaging biomarkers: a review of statistical methods for computer algorithm comparisons.

Authors: Obuchowski NA; Cleveland Clinic Foundation, Cleveland, OH, USA obuchon@ccf.org., Reeves AP; Cornell University, Ithaca, NY, USA., Huang EP; National Institutes of Health, Rockville, MD, USA., Wang XF; Cleveland Clinic Foundation, Cleveland, OH, USA., Buckler AJ; Elucid Bioimaging Inc., Wenham, MA, USA., Kim HJ; University of California, Los Angeles, CA, USA., Barnhart HX; Duke University, Durham, NC, USA., Jackson EF; University of Wisconsin-Madison, Madison, WI, USA., Giger ML; University of Chicago, Chicago, IL, USA., Pennello G; Food and Drug Administration/CDRH, Silver Spring, MD, USA., Toledano AY; Biostatistics Consulting, LLC, Kensington, MD, USA., Kalpathy-Cramer J; MGH/Harvard Medical School, Boston, MA, USA., Apanasovich TV; George Washington University, NW Washington, DC, USA., Kinahan PE; University of Washington, Seattle, WA, USA., Myers KJ; Food and Drug Administration/CDRH, Silver Spring, MD, USA., Goldgof DB; University of South Florida, Tampa, FL, USA., Barboriak DP; Duke University, Durham, NC, USA., Gillies RJ; H. Moffitt Cancer Center, Tampa, FL, USA., Schwartz LH; Columbia University, New York, NY, USA., Sullivan DC; Duke University, Durham, NC, USA.
Language: English
Source: Statistical methods in medical research [Stat Methods Med Res] 2015 Feb; Vol. 24 (1), pp. 68-106. Date of Electronic Publication: 2014 Jun 11.
DOI: 10.1177/0962280214537390
Abstract: Quantitative biomarkers from medical images are becoming important tools for clinical diagnosis, staging, monitoring, treatment planning, and development of new therapies. While there is a rich history of the development of quantitative imaging biomarker (QIB) techniques, little attention has been paid to the validation and comparison of the computer algorithms that implement the QIB measurements. In this paper we provide a framework for QIB algorithm comparisons. We first review and compare various study designs, including designs with the true value (e.g. phantoms, digital reference images, and zero-change studies), designs with a reference standard (e.g. studies testing equivalence with a reference standard), and designs without a reference standard (e.g. agreement studies and studies of algorithm precision). The statistical methods for comparing QIB algorithms are then presented for various study types using both aggregate and disaggregate approaches. We propose a series of steps for establishing the performance of a QIB algorithm, identify limitations in the current statistical literature, and suggest future directions for research.
(© The Author(s) 2014 Reprints and permissions: sagepub.co.uk/journalsPermissions.nav.)
Database: MEDLINE