Feature Importance in Nonlinear Embeddings (FINE): Applications in Digital Pathology

Authors: Shoshana B. Ginsburg, George Lee, Sahirzeeshan Ali, Anant Madabhushi
Year of publication: 2016
Subject:
Diagnostic Imaging
Male
Biopsy
Feature vector
Feature extraction
Breast Neoplasms
Feature selection
02 engineering and technology
Machine learning
computer.software_genre
Kernel principal component analysis
030218 nuclear medicine & medical imaging
03 medical and health sciences
0302 clinical medicine
Recurrence
Scoring algorithm
Image Interpretation, Computer-Assisted
Pathology
0202 electrical engineering, electronic engineering, information engineering
Humans
Electrical and Electronic Engineering
Mathematics
Radiological and Ultrasound Technology
Histocytochemistry
business.industry
Dimensionality reduction
Prostatic Neoplasms
Digital pathology
Pattern recognition
Computer Science Applications
Nonlinear Dynamics
Feature (computer vision)
Female
020201 artificial intelligence & image processing
Artificial intelligence
business
computer
Algorithms
Software
Source: IEEE Transactions on Medical Imaging. 35:76-88
ISSN: 1558-254X
0278-0062
DOI: 10.1109/tmi.2015.2456188
Description: Quantitative histomorphometry (QH) refers to the process of computationally modeling disease appearance on digital pathology images by extracting hundreds of image features and using them to predict disease presence or outcome. Since constructing a robust and interpretable classifier is challenging in a high-dimensional feature space, dimensionality reduction (DR) is often implemented prior to classifier construction. However, when DR is performed, it can be challenging to quantify the contribution of each of the original features to the final classification result. We have previously presented a method for scoring features based on their importance for classification on an embedding derived via principal components analysis (PCA). Nonlinear DR (NLDR), however, involves the eigen-decomposition of a kernel matrix rather than of the data itself, which compounds the issue of classifier interpretability. In this paper we present feature importance in nonlinear embeddings (FINE), an extension of our PCA-based feature scoring method to kernel PCA (KPCA), as well as to several NLDR algorithms that can be cast as variants of KPCA. FINE is applied to four digital pathology datasets to identify key QH features for predicting the risk of breast and prostate cancer recurrence. Measures of nuclear and glandular architecture and clusteredness were found to play an important role in predicting the likelihood of recurrence of both breast and prostate cancers. Compared to the t-test, Fisher score, and Gini index, FINE was able to identify a stable set of features that provide good classification accuracy on four publicly available datasets from the NIPS 2003 Feature Selection Challenge.
Database: OpenAIRE
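
Illustrative note: the description states that KPCA eigen-decomposes a kernel matrix rather than the data itself, which is why the original features are hard to score directly. The sketch below shows that difficulty and one generic workaround. It is not the FINE algorithm from the paper, whose details this record does not give. It assumes an RBF kernel, and it assumes a least-squares linear surrogate from standardized features to the embedding as the scoring rule; the function names `rbf_kernel_pca` and `linear_surrogate_scores`, the `gamma` value, and the toy data are all hypothetical.

```python
import numpy as np

def rbf_kernel_pca(X, n_components=2, gamma=0.5):
    """Kernel PCA with an RBF kernel (assumed kernel choice).

    The eigen-decomposition is applied to the centered n x n kernel
    matrix, not to the n x d data matrix, so no per-feature loadings
    fall out of it directly.
    """
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    K = np.exp(-gamma * d2)
    n = K.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    Kc = J @ K @ J
    vals, vecs = np.linalg.eigh(Kc)            # eigenvalues in ascending order
    order = np.argsort(vals)[::-1][:n_components]
    vals, vecs = vals[order], vecs[:, order]
    # Embedding coordinates: eigenvectors scaled by sqrt(eigenvalue).
    return vecs * np.sqrt(np.maximum(vals, 0.0))

def linear_surrogate_scores(X, Y):
    """Score features through a least-squares map X -> Y (an assumption,
    not the paper's scoring rule). Scores are normalized to sum to 1."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)  # put features on one scale
    Yc = Y - Y.mean(axis=0)
    W, *_ = np.linalg.lstsq(Xs, Yc, rcond=None)
    raw = np.sum(W ** 2, axis=1)               # squared weight per feature
    return raw / raw.sum()

# Toy data (hypothetical): only feature 0 carries the two-group structure;
# features 1..4 are low-amplitude noise.
rng = np.random.default_rng(0)
n = 100
labels = rng.integers(0, 2, size=n) * 2 - 1    # values in {-1, +1}
X = np.empty((n, 5))
X[:, 0] = 2.0 * labels + 0.3 * rng.standard_normal(n)
X[:, 1:] = 0.01 * rng.standard_normal((n, 4))

Y = rbf_kernel_pca(X, n_components=2, gamma=0.5)
scores = linear_surrogate_scores(X, Y)
```

On this toy data the embedding is driven almost entirely by feature 0, so that feature should receive the largest score. A linear surrogate only captures the part of the embedding that is linear in the features, which is the main limitation of this simplified approach.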