Abstract: |
This paper presents a method for estimating human pose from single real-world RGB images using deep learning. Existing deep-learning approaches to human pose estimation are formulated as detection or regression problems and operate in a part-based manner. As a new perspective, we introduce a classification scheme for this problem that reasons about the pose holistically. To the best of our knowledge, this is the first work on holistic human pose classification, a task made feasible by the power of convolutional neural networks in feature learning. After a convolutional neural network is trained to classify the input image into one of the KeyPoses, the final pose is computed as a linear combination of several KeyPoses. In this holistic classification approach, the vast, high-degree-of-freedom human pose space is divided into a finite number of subspaces, and the convolutional neural network shows promising results in learning the features of each subspace. Empirical results (PCP and PCK rates) demonstrate that the proposed scheme can understand human pose (i.e., predict a valid, true, and coarse pose) in real-world unconstrained images with challenges such as severe occlusion, high articulation, low image quality, and cluttered backgrounds. Furthermore, the proposed method removes the need to define a complex model (such as an appearance model or pairwise joint relations). We have also verified a potential application of the proposed method in pose-based semantic image retrieval. [ABSTRACT FROM AUTHOR] |