Autor: |
JINHUI TANG, JING WANG, ZECHAO LI, JIANLONG FU, TAO MEI |
Zdroj: |
ACM Transactions on Multimedia Computing, Communications & Applications; Jul2019, Vol. 15 Issue 2s, p1-20, 20p |
Abstrakt: |
Despite the promising progress made in visual captioning and paragraphing, visual storytelling is still largely unexplored. This task is more challenging due to the difficulty in modeling an ordered photo sequence and in generating a relevant paragraph with expressive language style for storytelling. To deal with these challenges, we propose an Attribute-based Hierarchical Generative model with Reinforcement Learning and adversarial training (AHGRL). First, tomodel the ordered photo sequence and the complex story structure, we propose an attribute-based hierarchical generator. The generator incorporates semantic attributes to create more accurate and relevant descriptions. The hierarchical framework enables the generator to learn from the complex paragraph structure. Second, to generate story-style paragraphs, we design a language-style discriminator, which provides word-level rewards to optimize the generator by policy gradient. Third, we further consider the story generator and the reward critic as adversaries. The generator aims to create indistinguishable paragraphs to human-level stories, whereas the critic aims at distinguishing them and further improving the generator. Extensive experiments on the widely used dataset well demonstrate the advantages of the proposed method over state-of-the-art methods. [ABSTRACT FROM AUTHOR] |
Databáze: |
Complementary Index |
Externí odkaz: |
|