Abstract: |
Label distribution learning (LDL) is the state-of-the-art approach to a number of real-world applications, such as chronological age estimation from a face image, where there is an inherent similarity among adjacent age labels. LDL accounts for this semantic similarity by assigning a label distribution to each instance. The well-known Kullback-Leibler (KL) divergence is the most widely used loss function in the LDL framework. However, the KL divergence does not fully and effectively capture the semantic similarity among age labels, leading to suboptimal performance. In this article, we propose a novel loss function based on optimal transport theory for LDL-based age estimation. The ground metric function plays an important role in the optimal transport formulation and should be carefully chosen according to the underlying geometric structure of the label space of the application at hand. The label space in the age estimation problem has a specific geometric structure: closer ages have stronger inherent semantic relationships. Inspired by this, we devise a novel ground metric function that enables the loss function to increase the influence of highly correlated ages, thereby exploiting the semantic similarity among ages more effectively than existing loss functions. We then use the proposed loss function, namely the γ-Wasserstein loss, to train a deep neural network (DNN). This leads to a computationally expensive, nonconvex optimization problem. Following the standard methodology, we reformulate the optimization problem in a convex form and then use an efficient iterative algorithm to update the parameters of the DNN. Extensive age estimation experiments on different benchmark datasets validate the effectiveness of the proposed method, which consistently outperforms state-of-the-art approaches.
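The abstract's core idea can be illustrated with a minimal sketch. The paper's actual γ-Wasserstein loss and ground metric are not reproduced here; the function below is a hypothetical illustration using the closed-form 1-D Wasserstein distance (CDF differences over the ordered age axis), which, unlike KL divergence, penalizes a prediction more when its probability mass sits on ages farther from the target:

```python
import numpy as np

def wasserstein_1d_loss(p, q, ages):
    """1-Wasserstein distance between two discrete label distributions
    over an ordered age axis, via the closed-form CDF-difference formula.
    Illustrative only -- not the paper's gamma-Wasserstein loss."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    cdf_diff = np.cumsum(p - q)        # pointwise CDF gap along the age axis
    spacing = np.diff(np.asarray(ages, dtype=float))  # gaps between adjacent labels
    return float(np.sum(np.abs(cdf_diff[:-1]) * spacing))

ages = np.arange(5)
target = np.array([0.0, 0.1, 0.8, 0.1, 0.0])
near = np.array([0.0, 0.8, 0.1, 0.1, 0.0])  # mass shifted to an adjacent age
far  = np.array([0.8, 0.1, 0.1, 0.0, 0.0])  # mass shifted two ages away
# A transport-based loss respects label geometry: the nearby error costs less.
assert wasserstein_1d_loss(near, target, ages) < wasserstein_1d_loss(far, target, ages)
```

Note that KL divergence would score `near` and `far` identically whenever their probability values are merely permuted, since it compares distributions pointwise and ignores the distance between age labels.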