Description: |
Different versions of principal component analysis (PCA) have been widely used to extract important information for image recognition and image clustering problems. However, extracting this information remains challenging when outliers are present. This paper proposes a new PCA methodology based on a novel discovery: the widely used l1-PCA is equivalent to a two-group k-means clustering model. The projection vector of the l1-PCA is the vector difference between the two cluster centers estimated by the clustering model. In theory, this vector difference provides inter-cluster information, which is beneficial for distinguishing data objects from different classes. However, the performance of l1-PCA does not match that of state-of-the-art methods. This is because the equivalent clustering model is not robust to outliers, which makes the l1-PCA sensitive to them. To overcome this limitation, we introduce a trimming function into the clustering model and propose a trimmed-clustering based l1-PCA (TC-PCA). With this trimming set formulation, the TC-PCA is not sensitive to outliers. In addition, we mathematically prove the convergence of the proposed algorithm. Experimental results on image classification and clustering indicate that the proposed method outperforms the current state-of-the-art methods. |