Dataset bias exposed in face verification

Author: Carlos V. Regueiro, Roberto Iglesias, Xosé M. Pardo, Eric Lopez-Lopez, Fernando E. Casado
Contributors: Universidade de Santiago de Compostela. Centro de Investigación en Tecnoloxías da Información, Universidade de Santiago de Compostela. Departamento de Electrónica e Computación
Year of publication: 2019
Subject:
Source: Minerva: Repositorio Institucional de la Universidad de Santiago de Compostela
Universidad de Santiago de Compostela (USC)
ISSN: 2047-4946
2047-4938
DOI: 10.1049/iet-bmt.2018.5224
Description: This is the peer reviewed version of the following article: López‐López, E., Pardo, X.M., Regueiro, C.V., Iglesias, R. and Casado, F.E. (2019), Dataset bias exposed in face verification. IET Biom., 8: 249-258, which has been published in final form at https://doi.org/10.1049/iet-bmt.2018.5224. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for Use of Self-Archived Versions.
Most facial verification methods assume that training and testing sets contain independent and identically distributed samples, although in many real applications this assumption does not hold. Whenever gathering a representative dataset in the target domain is infeasible, it is necessary to choose one of the already available (source domain) datasets. Here, the differences among six public datasets were studied, along with how they affect the performance of the learned methods. In the considered scenario of mobile devices, the individual of interest is enrolled using a few facial images taken in the operational domain, while training impostors are drawn from one of the publicly available datasets. This work tried to shed light on the inherent differences among the datasets and on the potential harms that should be considered when they are combined for training and testing. Results indicate that performance drops whenever training and testing are done on different datasets, compared to using the same dataset in both phases. However, the size of this drop depends strongly on the kind of features used. In addition, the representation of samples in the feature space gives insight into the extent to which bias is an endogenous or an exogenous factor.
This work has received financial support from the Xunta de Galicia, Consellería de Cultura, Educación e Ordenación Universitaria (Accreditation 2016–2019, EDG431G/01 and ED431G/08, and reference competitive group 2014–2017, GRC2014/030), the European Union: European Social Fund (ESF), European Regional Development Fund (ERDF) and FEDER funds and (AEI/FEDER, UE) grant number TIN2017‐90135‐R. Eric López received financial support from the Xunta de Galicia and the European Union (European Social Fund ‐ ESF).
Database: OpenAIRE