Feature Importance in Nonlinear Embeddings (FINE): Applications in Digital Pathology
Authors: | Shoshana B. Ginsburg, George Lee, Sahirzeeshan Ali, Anant Madabhushi |
---|---|
Year of publication: | 2016 |
Subject: |
Diagnostic Imaging; Male; Biopsy; Feature vector; Feature extraction; Breast Neoplasms; Feature selection; 02 engineering and technology; Machine learning; Kernel principal component analysis; 030218 nuclear medicine & medical imaging; 03 medical and health sciences; 0302 clinical medicine; Recurrence; Scoring algorithm; Image Interpretation, Computer-Assisted; Pathology; 0202 electrical engineering, electronic engineering, information engineering; Humans; Electrical and Electronic Engineering; Mathematics; Radiological and Ultrasound Technology; Histocytochemistry; Dimensionality reduction; Prostatic Neoplasms; Digital pathology; Pattern recognition; Computer Science Applications; Nonlinear Dynamics; Feature (computer vision); Female; 020201 artificial intelligence & image processing; Artificial intelligence; Algorithms; Software |
Source: | IEEE Transactions on Medical Imaging. 35:76-88 |
ISSN: | 1558-254X 0278-0062 |
DOI: | 10.1109/tmi.2015.2456188 |
Description: | Quantitative histomorphometry (QH) refers to the process of computationally modeling disease appearance on digital pathology images by extracting hundreds of image features and using them to predict disease presence or outcome. Since constructing a robust and interpretable classifier is challenging in a high-dimensional feature space, dimensionality reduction (DR) is often implemented prior to classifier construction. However, when DR is performed, it can be challenging to quantify the contribution of each of the original features to the final classification result. We have previously presented a method for scoring features based on their importance for classification on an embedding derived via principal component analysis (PCA). However, nonlinear DR (NLDR) involves the eigen-decomposition of a kernel matrix rather than the data itself, compounding the issue of classifier interpretability. In this paper we present feature importance in nonlinear embeddings (FINE), an extension of our PCA-based feature scoring method to kernel PCA (KPCA), as well as several NLDR algorithms that can be cast as variants of KPCA. FINE is applied to four digital pathology datasets to identify key QH features for predicting the risk of breast and prostate cancer recurrence. Measures of nuclear and glandular architecture and clusteredness were found to play an important role in predicting the likelihood of recurrence of both breast and prostate cancers. Compared to the t-test, Fisher score, and Gini index, FINE was able to identify a stable set of features that provide good classification accuracy on four publicly available datasets from the NIPS 2003 Feature Selection Challenge. |
Database: | OpenAIRE |
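
The abstract notes that nonlinear DR eigen-decomposes a kernel matrix rather than the data itself, which is why the contribution of each original feature is harder to read off. The following is a minimal sketch of that contrast only, not the FINE algorithm from the paper. The synthetic data, the RBF kernel, the `gamma` value, and the helper name `rbf_kernel` are all assumptions chosen for illustration.

```python
import numpy as np

# Synthetic data (assumption): 20 samples, 5 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))
Xc = X - X.mean(axis=0)  # center each feature

# Linear PCA: eigen-decompose the 5x5 feature covariance matrix.
# Each eigenvector has one entry (loading) per original feature.
cov = Xc.T @ Xc / (Xc.shape[0] - 1)
pca_vals, pca_vecs = np.linalg.eigh(cov)


def rbf_kernel(A, gamma=0.1):
    """Hypothetical helper: RBF kernel matrix between all rows of A."""
    sq = np.sum(A ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (A @ A.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))


# Kernel PCA: eigen-decompose the centered 20x20 kernel matrix.
# Each eigenvector has one entry per sample, not per feature.
K = rbf_kernel(X)
n = K.shape[0]
H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
Kc = H @ K @ H
kpca_vals, kpca_vecs = np.linalg.eigh(Kc)

print("PCA eigenvector matrix shape: ", pca_vecs.shape)
print("KPCA eigenvector matrix shape:", kpca_vecs.shape)
```

In the linear case the eigenvectors live in feature space, so loadings map directly onto original features. In the kernel case they are indexed by samples, so an additional step is needed to attribute importance back to features, which is the gap the abstract says FINE addresses.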