Image-Text Surgery: Efficient Concept Learning in Image Captioning by Generating Pseudopairs
Authors: | Changshui Zhang, Jin Li, Junqi Jin, Kun Fu |
---|---|
Year of publication: | 2018 |
Subject: |
Closed captioning; medicine.medical_specialty; Artificial neural network; Syntax (programming languages); Computer Networks and Communications; Computer science; 020207 software engineering; 02 engineering and technology; Computer Science Applications; Visualization; Data modeling; Surgery; Knowledge-based systems; Artificial Intelligence; 0202 electrical engineering, electronic engineering, information engineering; medicine; 020201 artificial intelligence & image processing; F1 score; Software; Natural language |
Source: | IEEE Transactions on Neural Networks and Learning Systems. 29:5910-5921 |
ISSN: | 2162-2388 2162-237X |
DOI: | 10.1109/tnnls.2018.2813306 |
Description: | Image captioning aims to generate natural language sentences that describe the salient parts of a given image. Although neural networks have recently achieved promising results, a key problem is that they can only describe concepts seen in the training image-sentence pairs. Efficient learning of novel concepts has thus become a topic of recent interest, as a way to reduce the costly manual effort of labeling data. In this paper, we propose a novel method, Image-Text Surgery, to synthesize pseudo image-sentence pairs. The pseudopairs are generated under the guidance of a knowledge base, with syntax from a seed data set (i.e., MSCOCO) and visual information from an existing large-scale image base (i.e., ImageNet). Using this pseudodata, the captioning model learns novel concepts without any corresponding human-labeled pairs. We further introduce adaptive visual replacement, which uses an attention mechanism to adaptively filter unnecessary visual features in the pseudodata. We evaluate our approach on a held-out subset of the MSCOCO data set. The experimental results demonstrate that the proposed approach provides significant performance improvements over state-of-the-art methods in terms of F1 score and sentence quality. An ablation study and qualitative results further validate the effectiveness of our approach. |
Database: | OpenAIRE |