Comparison of Logistic Regression and Neural Net Modeling for Prediction of Prostate Cancer Pathologic Stage

Authors: M. Craig Miller, Gerard J. O'Dowd, Edward C. Poole, Manisha Chaudhari, Robert W. Veltri, Alan W. Partin
Publication year: 2002
Subject:
Source: Clinical Chemistry. 48:1828-1834
ISSN: 1530-8561, 0009-9147
Description: Background: Prostate cancer (PCa) pathologic staging remains a challenge for the physician using individual pretreatment variables. We have previously reported that UroScore™, a logistic regression (LR)-derived algorithm, can correctly predict organ-confined (OC) disease state with >90% accuracy. This study compares statistical and neural network (NN) approaches to predict PCa stage. Methods: A subset (756 of 817) of radical prostatectomy patients was assessed: 434 with OC disease, 173 with capsular penetration (NOC-CP), and 149 with metastases (NOC-AD) in the training sample. Additionally, an OC + NOC-CP (n = 607) vs NOC-AD (n = 149) two-outcome model was prepared. Validation sets included 120 or 397 cases not used for modeling. Input variables included clinical and several quantitative biopsy pathology variables. The classification accuracies achieved with an NN with an error back-propagation architecture were compared with those of LR statistical modeling. Results: We demonstrated >95% detection of OC PCa in three-outcome models, using both computational approaches. For training patient samples that were equally distributed for the three-outcome models, NNs gave a significantly higher overall classification accuracy than the LR approach (96% vs 40%, respectively). In the two-outcome models using either unequal or equal case distribution, the NNs had only a marginal advantage in classification accuracy over LR. Conclusions: The strength of a mathematics-based disease-outcome model depends on the quality of the input variables, quantity of cases, case sample input distribution, and computational methods of data processing of inputs and outputs. We identified specific advantages for NNs, especially in the prediction of multiple-outcome models, related to the ability to pre- and postprocess inputs and outputs.
Database: OpenAIRE