      Automatic Image Captioning Based on ResNet50 and LSTM with Soft Attention

      Wireless Communications and Mobile Computing
      Hindawi Limited


          Abstract

          Automatically captioning images with proper descriptions has become an interesting and challenging problem. In this paper, we present a joint model, AICRL, which performs automatic image captioning based on ResNet50 and LSTM with soft attention. AICRL consists of one encoder and one decoder. The encoder adopts ResNet50, a convolutional neural network, which creates an extensive representation of the given image by embedding it into a fixed-length vector. The decoder combines an LSTM, a recurrent neural network, with a soft attention mechanism to selectively focus attention on certain parts of an image when predicting the next word of the caption. We trained AICRL on the large MS COCO 2014 dataset to maximize the likelihood of the target description sentence given the training images, and evaluated it with several metrics, including BLEU, METEOR, and CIDEr. Our experimental results indicate that AICRL is effective in generating captions for images.
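
          The soft attention step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes an additive (tanh-based) scoring function and assumes the encoder exposes a set of region feature vectors (for example, the spatial grid of a ResNet50 feature map) instead of a single vector. The names `soft_attention`, `w_feat`, `w_hid`, and `v` are hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def soft_attention(features, hidden, w_feat, w_hid, v):
    """One soft attention step (assumed additive scoring, hypothetical names).

    features: list of L region vectors, each of length D
    hidden:   decoder (LSTM) hidden state, length H
    w_feat:   K x D projection for region features
    w_hid:    K x H projection for the hidden state
    v:        length-K scoring vector
    Returns (weights, context): L attention weights that sum to 1 and
    the weighted average of the region vectors (length D).
    """
    hid_proj = matvec(w_hid, hidden)
    scores = []
    for region in features:
        feat_proj = matvec(w_feat, region)
        activated = [math.tanh(f + h) for f, h in zip(feat_proj, hid_proj)]
        scores.append(sum(vk * a for vk, a in zip(v, activated)))
    weights = softmax(scores)
    dim = len(features[0])
    context = [
        sum(w * region[d] for w, region in zip(weights, features))
        for d in range(dim)
    ]
    return weights, context
```

          In a decoder of this kind, the context vector would typically be fed to the LSTM together with the previous word embedding at each time step, so that each predicted word can draw on different image regions.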

                Author and article information

                Journal: Wireless Communications and Mobile Computing
                Publisher: Hindawi Limited
                ISSN: 1530-8677, 1530-8669
                Published: October 20, 2020
                Volume: 2020
                Pages: 1-7
                Affiliations
                [1] Harbin Engineering University, Harbin 150001, China
                [2] Zhongnan University of Economics and Law, Wuhan 430073, China
                [3] Singapore Institute of Technology, 138683, Singapore
                Article
                DOI: 10.1155/2020/8909458
                ID: b45cb3b1-1e60-482f-824d-e9ceb96476c0
                © 2020
                License: https://creativecommons.org/licenses/by/4.0/
