Popis: |
As Transformer-based architectures have recently shown encouraging progresses in computer vision. In this work, we present the solution to the Google Landmark Recognition 2021 Challenge held on Kaggle, which is an improvement on our last year's solution by changing three designs, including (1) Using Swin and CSWin as backbone for feature extraction, (2) Train on full GLDv2, and (3) Using full GLDv2 images as index image set for kNN search. With these modifications, our solution significantly improves last year solution on this year competition. Our full pipeline, after ensembling Swin, CSWin, EfficientNet B7 models, scores 0.4907 on the private leaderboard which help us to get the 2nd place in the competition. |