Is Open Access

Generating Natural Questions About an Image

Preprint

Author(s): Nasrin Mostafazadeh, Ishan Misra, Jacob Devlin, Larry Zitnick, Margaret Mitchell, Xiaodong He, Lucy Vanderwende

Publication date Created: 2016-03-19

Abstract

There has been an explosion of work in the vision & language community during the past few years, from image captioning to video transcription to answering questions about images. These tasks focus on literal descriptions of the image. To move beyond the literal, we explore how questions about an image often address abstract events that the objects evoke. In this paper, we introduce the novel task of 'Visual Question Generation (VQG)', where the system is tasked with asking a natural and engaging question when shown an image. We provide three datasets that cover a variety of images, from object-centric to event-centric, providing different and more abstract training data than state-of-the-art captioning systems have used thus far. We train and test several generative and retrieval models to tackle the task of VQG. Evaluation results show that while such models ask reasonable questions given various images, there is still a wide gap with human performance. Our proposed task offers a new challenge to the community, which we hope can spur further interest in exploring deeper connections between vision & language.

Author and article information

Publication date Updated: 2016-03-22

Article

arXiv ID: 1603.06059

SO-VID: 3c7e6115-226a-453d-9ce8-87f60bfd9907

License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/

Categories cs.CL cs.AI cs.CV

ScienceOpen disciplines: Computer vision & Pattern recognition, Theoretical computer science, Artificial intelligence
