      Is Open Access

      Generating Natural Questions About an Image

      Preprint


          Abstract

          There has been an explosion of work in the vision & language community during the past few years, from image captioning to video transcription and answering questions about images. These tasks focus on literal descriptions of the image. To move beyond the literal, we choose to explore how questions about an image often address abstract events that the objects evoke. In this paper, we introduce the novel task of 'Visual Question Generation (VQG)', where the system is tasked with asking a natural and engaging question when shown an image. We provide three datasets which cover a variety of images from object-centric to event-centric, providing different and more abstract training data than the state-of-the-art captioning systems have used thus far. We train and test several generative and retrieval models to tackle the task of VQG. Evaluation results show that while such models ask reasonable questions given various images, there is still a wide gap with human performance. Our proposed task offers a new challenge to the community which we hope can spur further interest in exploring deeper connections between vision & language.


                Author and article information

                Journal
                2016-03-19
                2016-03-22
                Article
                arXiv:1603.06059
                3c7e6115-226a-453d-9ce8-87f60bfd9907

                http://arxiv.org/licenses/nonexclusive-distrib/1.0/

                History
                Custom metadata
                cs.CL cs.AI cs.CV

                Computer vision & Pattern recognition,Theoretical computer science,Artificial intelligence
