The EyCon Dataset: A Visual Corpus of Early Conflict Photography

Authors: Marina Giardinetti, Daniel Foliard, Julien Schuh, Mohamed-Salim Aissi
Language: English
Year of publication: 2024
Subject:
Source: Journal of Open Humanities Data, Vol 10, Pp 40-40 (2024)
Document type: article
ISSN: 2059-481X
DOI: 10.5334/johd.213
Description: The EyCon dataset, comprising nearly 130,000 JPEG images and pages, documents armed conflicts from the 1890s to 1918, with a focus on extra-European contexts. The project team aggregated thousands of digitized images and metadata from various institutions, including previously inaccessible documents. To enhance metadata, the team conducted visual and multimodal similarity analyses, as well as human and animal detection. Captions were processed to extract named entities for XML-formatted descriptive metadata. Challenges in identifying and publishing graphic images due to automated tools’ limitations in detecting violence were addressed with human expertise for accurate classification. Available online and on Zenodo for download and reuse, the dataset confronts issues in computer vision for heritage photographs, such as degradation from fading, discoloration, scratches and noise, which impair algorithms reliant on visual features. The under-representation of early photographic cultures in datasets introduces bias in applying standard solutions to archival materials.
Database: Directory of Open Access Journals