Clip-Level Feature Aggregation: A Key Factor for Video-Based Person Re-identification
Author: | Bart Goossens, Ljiljana Platisa, Wilfried Philips, Chengjin Lyu, Patrick Heyer-Wollenberg, Peter Veelaert |
---|---|
Contributors: | Blanc-Talon, Jacques; Delmas, Patrice; Philips, Wilfried; Popescu, Dan; Scheunders, Paul |
Publication year: | 2020 |
Subject: |
Normalization (statistics)
Technology and Engineering; business.industry; Computer science; 010401 analytical chemistry; Aggregate (data warehouse); Convolutional neural network; Pattern recognition; 02 engineering and technology; 01 natural sciences; 0104 chemical sciences; Task (computing); Person re-identification; Feature (computer vision); Factor (programming language); 0202 electrical engineering, electronic engineering, information engineering; Key (cryptography); 020201 artificial intelligence & image processing; Artificial intelligence; Feature aggregation; business; Representation (mathematics); computer; computer.programming_language |
Source: | Advanced Concepts for Intelligent Vision Systems: 20th International Conference, ACIVS 2020, Auckland, New Zealand, February 10–14, 2020, Proceedings; Lecture Notes in Computer Science; ISBN: 9783030406042 |
ISSN: | 0302-9743 1611-3349 |
DOI: | 10.1007/978-3-030-40605-9_16 |
Description: | In the task of video-based person re-identification, features of persons in the query and gallery sets are compared to find the best match. Most existing methods aggregate the frame-level features using a temporal method to generate clip-level features rather than sequence-level representations. In this paper, we propose a new method that aggregates the clip-level features to obtain sequence-level representations of persons. It consists of two parts: Average Aggregation Strategy (AAS) and Raw Feature Utilization (RFU). AAS makes use of all frames in a video sequence to generate a better representation of a person, while RFU investigates how the batch normalization operation influences feature representations in person re-identification. The experimental results demonstrate that our method can boost the accuracy of existing models. In particular, we achieve 87.7% rank-1 and 82.3% mAP on the MARS dataset without any post-processing procedure, which outperforms the existing state of the art. |
Database: | OpenAIRE |
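
The aggregation idea summarized in the description can be illustrated with a minimal sketch. The record does not give implementation details, so everything below is an assumption for illustration: the function names (`split_into_clips`, `mean_vector`, `sequence_representation`, `best_match`) are hypothetical, plain averaging over frames stands in for whatever temporal method a real model uses to produce clip-level features, and Euclidean distance is an assumed matching metric. The input vectors are assumed to be the raw features, taken before any batch normalization layer, in the spirit of RFU.

```python
from math import sqrt
from typing import Dict, List, Sequence

Vector = List[float]


def split_into_clips(frames: Sequence[Vector], clip_len: int) -> List[List[Vector]]:
    """Cut frame-level features into consecutive clips so that every frame is used."""
    if clip_len <= 0:
        raise ValueError("clip_len must be positive")
    return [list(frames[i:i + clip_len]) for i in range(0, len(frames), clip_len)]


def mean_vector(vectors: Sequence[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors."""
    if not vectors:
        raise ValueError("cannot average an empty set of vectors")
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def sequence_representation(frames: Sequence[Vector], clip_len: int) -> Vector:
    """Average clip-level features into one sequence-level feature.

    Assumption: averaging the frames of a clip stands in for the model's
    temporal aggregation; the second average over clips is the step that
    corresponds to an average aggregation strategy.
    """
    clips = split_into_clips(frames, clip_len)
    clip_features = [mean_vector(clip) for clip in clips]
    return mean_vector(clip_features)


def euclidean(a: Vector, b: Vector) -> float:
    """Euclidean distance between two feature vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def best_match(query: Vector, gallery: Dict[str, Vector]) -> str:
    """Return the gallery identity whose feature is closest to the query."""
    return min(gallery, key=lambda identity: euclidean(query, gallery[identity]))


# Toy data: four frames with 2-D features, grouped into clips of two frames.
frames = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
query_feature = sequence_representation(frames, clip_len=2)
gallery = {"person_a": [4.0, 5.1], "person_b": [0.0, 0.0]}
match = best_match(query_feature, gallery)
```

In this sketch a final clip shorter than `clip_len` receives the same weight as a full clip; whether the paper handles that case differently is not stated in the record.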