Grounded Compositional Semantics for Finding and Describing Images with Sentences

Most cited references

From Frequency to Meaning: Vector Space Models of Semantics

P. Turney, P. Pantel (2010)

Computers understand very little of the meaning of human language. This profoundly limits our ability to give instructions to computers, the ability of computers to explain their actions to us, and the ability of computers to analyse and process text. Vector space models (VSMs) of semantics are beginning to address these limits. This paper surveys the use of VSMs for semantic processing of text. We organize the literature on VSMs according to the structure of the matrix in a VSM. There are currently three broad classes of VSMs, based on term-document, word-context, and pair-pattern matrices, yielding three classes of applications. We survey a broad range of applications in these three categories and we take a detailed look at a specific open source project in each category. Our goal in this survey is to show the breadth of applications of VSMs for semantics, to provide a new perspective on VSMs for those who are already familiar with the area, and to provide pointers into the literature for those who are less familiar with the field.

Cited 155 times

Composition in distributional models of semantics.

Jeff Mitchell, Mirella Lapata (2010)

Vector-based models of word meaning have become increasingly popular in cognitive science. The appeal of these models lies in their ability to represent meaning simply by using distributional information under the assumption that words occurring within similar contexts are semantically similar. Despite their widespread use, vector-based models are typically directed at representing words in isolation, and methods for constructing representations for phrases or sentences have received little attention in the literature. This is in marked contrast to experimental evidence (e.g., in sentential priming) suggesting that semantic similarity is more complex than simply a relation between isolated words. This article proposes a framework for representing the meaning of word combinations in vector space. Central to our approach is vector composition, which we operationalize in terms of additive and multiplicative functions. Under this framework, we introduce a wide range of composition models that we evaluate empirically on a phrase similarity task.
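
The additive and multiplicative composition functions named in this abstract can be sketched in a few lines. This is a minimal illustration only, assuming the simplest unweighted forms (element-wise sum and element-wise product of two word vectors); the function names and the toy vectors for "horse" and "run" are hypothetical and are not taken from this page.

```python
from typing import List

Vector = List[float]


def _check_same_length(u: Vector, v: Vector) -> None:
    # Both composition functions below are element-wise, so the
    # two word vectors must live in the same space.
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimensionality")


def compose_additive(u: Vector, v: Vector) -> Vector:
    """Additive composition: p_i = u_i + v_i."""
    _check_same_length(u, v)
    return [a + b for a, b in zip(u, v)]


def compose_multiplicative(u: Vector, v: Vector) -> Vector:
    """Multiplicative composition: p_i = u_i * v_i (element-wise product)."""
    _check_same_length(u, v)
    return [a * b for a, b in zip(u, v)]


# Hypothetical toy co-occurrence vectors (illustrative values only).
horse = [0.0, 6.0, 2.0, 10.0, 4.0]
run = [1.0, 8.0, 4.0, 4.0, 0.0]

print(compose_additive(horse, run))        # [1.0, 14.0, 6.0, 14.0, 4.0]
print(compose_multiplicative(horse, run))  # [0.0, 48.0, 8.0, 40.0, 0.0]
```

In this sketch the multiplicative form zeroes out any dimension where either word has no weight, while the additive form keeps contributions from both words.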

Cited 139 times

Framing Image Description as a Ranking Task: Data, Models and Evaluation Metrics

P. Young, M. Hodosh, J. Hockenmaier (2013)

The ability to associate images with natural language sentences that describe what is depicted in them is a hallmark of image understanding, and a prerequisite for applications such as sentence-based image search. In analogy to image search, we propose to frame sentence-based image annotation as the task of ranking a given pool of captions. We introduce a new benchmark collection for sentence-based image description and search, consisting of 8,000 images that are each paired with five different captions which provide clear descriptions of the salient entities and events. We introduce a number of systems that perform quite well on this task, even though they are only based on features that can be obtained with minimal supervision. Our results clearly indicate the importance of training on multiple captions per image, and of capturing syntactic (word order-based) and semantic features of these captions. We also perform an in-depth comparison of human and automatic evaluation metrics for this task, and propose strategies for collecting human judgments cheaply and on a very large scale, allowing us to augment our collection with additional relevance judgments of which captions describe which image. Our analysis shows that metrics that consider the ranked list of results for each query image or sentence are significantly more robust than metrics that are based on a single response per query. Moreover, our study suggests that the evaluation of ranking-based image description systems may be fully automated.

Cited 109 times

Author and article information

Journal

Title: Transactions of the Association for Computational Linguistics

Abbreviated Title: Transactions of the Association for Computational Linguistics

Publisher: MIT Press - Journals

ISSN (Electronic): 2307-387X

Publication date (Created): December 2014

Publication date (Print): December 2014

Volume: 2

Pages: 207-218

Affiliations

[1] Stanford University, Computer Science Department

[2] Google Inc.

Article

DOI: 10.1162/tacl_a_00177

SO-VID: 8bfd4048-2243-4668-9bc9-4ed851980b61
