Learning Compact Reward for Image Captioning

Abstract

Adversarial learning has shown promise in generating natural and diverse descriptions in image captioning. However, the reward learned by existing adversarial methods is vague and ill-defined due to the reward ambiguity problem. In this paper, we propose a refined Adversarial Inverse Reinforcement Learning (rAIRL) method that handles the reward ambiguity problem by disentangling the reward for each word in a sentence, and that achieves stable adversarial training by refining the loss function to shift the generator towards Nash equilibrium. In addition, we introduce a conditional term in the loss function to mitigate mode collapse and to increase the diversity of the generated descriptions. Our experiments on MS COCO and Flickr30K show that our method can learn a compact reward for image captioning.

Author and article information

Publication date (created): 24 March 2020

Article

arXiv ID: 2003.10925

SO-VID: 8fc8c129-aca9-45cf-bfab-9f7174607d76

License:

http://arxiv.org/licenses/nonexclusive-distrib/1.0/

Custom metadata

Comments: 13 pages, 10 figures

Categories: cs.CV, cs.CL

ScienceOpen disciplines: Computer vision & Pattern recognition, Theoretical computer science
