Description: |
Person re-identification (re-ID) is an important topic in computer vision. In this paper, we study unsupervised person re-ID, which aims to identify a target identity across multiple non-overlapping cameras for intelligent surveillance systems. The main challenge of unsupervised person re-ID lies in learning discriminative features without any annotated data. We apply the Vision Transformer (ViT) to the unsupervised person re-ID task. Combined with multi-label classification, it outperforms most CNN-based methods. We evaluate the proposed model on Market-1501, DukeMTMC-reID and MSMT17, where it achieves 56.6%, 49.4% and 14.5% mAP, respectively, outperforming the baseline by a clear margin and reaching state-of-the-art performance among unsupervised re-ID methods. |