Author: |
Ruta, Dan, Gilbert, Andrew, Aggarwal, Pranav, Marri, Naveen, Kale, Ajinkya, Briggs, Jo, Speed, Chris, Jin, Hailin, Faieta, Baldo, Filipkowski, Alex, Lin, Zhe, Collomosse, John |
Contributors: |
Avidan, Shai, Brostow, Gabriel, Cissé, Moustapha, Farinella, Giovanni Maria |
Language: |
English |
Publication year: |
2022 |
Source: |
Ruta, D, Gilbert, A, Aggarwal, P, Marri, N, Kale, A, Briggs, J, Speed, C, Jin, H, Faieta, B, Filipkowski, A, Lin, Z & Collomosse, J 2022, StyleBabel: Artistic style tagging and captioning. In S Avidan, G Brostow, M Cissé & G M Farinella (eds), Computer Vision – ECCV 2022. Lecture Notes in Computer Science, pp. 219-236, European Conference on Computer Vision 2022, Tel Aviv, Israel, 23/10/22. https://doi.org/10.1007/978-3-031-20074-8_13 |
DOI: |
10.1007/978-3-031-20074-8_13 |
Description: |
We present StyleBabel, a unique open-access dataset of natural language captions and free-form tags describing the artistic style of over 135K digital artworks, collected via a novel participatory method from experts studying at specialist art and design schools. StyleBabel was collected via an iterative method, inspired by ‘Grounded Theory’: a qualitative approach that enables annotation while co-evolving a shared language for fine-grained artistic style attribute description. We demonstrate several downstream tasks for StyleBabel, adapting the recent ALADIN architecture for fine-grained style similarity, to train cross-modal embeddings for: 1) free-form tag generation; 2) natural language description of artistic style; 3) fine-grained text search of style. To do so, we extend ALADIN with recent advances in Vision Transformer (ViT) and cross-modal representation learning, achieving state-of-the-art accuracy in fine-grained style retrieval. |
Database: |
OpenAIRE |