Interest in word embeddings has grown across many areas of text processing in recent years, following the introduction of the word2vec model by Mikolov et al. (2013) and the GloVe model by Pennington et al. (2014). Word embedding models use a large amount of text to create low-dimensional representations of words that capture relationships between words without any external supervision. The resulting representations have been shown to replicate many linguistic regularities, such as semantic similarity between terms, conceptual composition of terms, and analogy relations between terms. These features can be used for Information Retrieval (IR), where the retrieval functions primarily depend on statistical co-occurrences.
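As a purely illustrative sketch of the regularities mentioned above, the following Python snippet computes cosine similarity and a simple vector-offset analogy. The three-dimensional vectors, the vocabulary, and the function names `cosine` and `analogy` are hypothetical and hand-picked for illustration; they are not taken from word2vec, GloVe, or any trained model.

```python
import math

# Hypothetical toy embeddings, hand-picked for illustration only.
# Real models learn vectors with hundreds of dimensions from large corpora.
EMBEDDINGS = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.0, 0.3, 0.3],
}


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(a, b, c, embeddings=EMBEDDINGS):
    """Answer 'a is to b as c is to ?' using the vector offset b - a + c.

    Returns the vocabulary word (excluding a, b, and c) whose vector is
    closest in cosine similarity to the offset vector.
    """
    target = [
        vb - va + vc
        for va, vb, vc in zip(embeddings[a], embeddings[b], embeddings[c])
    ]
    candidates = [w for w in embeddings if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(embeddings[w], target))


if __name__ == "__main__":
    # Semantic similarity: compare how close two pairs of terms are.
    print(cosine(EMBEDDINGS["king"], EMBEDDINGS["queen"]))
    print(cosine(EMBEDDINGS["king"], EMBEDDINGS["apple"]))
    # Analogy: man is to king as woman is to ?
    print(analogy("man", "king", "woman"))
```

With these toy vectors, "king" is closer to "queen" than to "apple", and the offset king - man + woman lands nearest to "queen". In an IR setting, the same similarity function could in principle be used to relate query terms to document terms that do not co-occur literally, under the assumption that trained embeddings are available.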
Author and article information
Indian Statistical Institute, Kolkata
203, B.T. Road
Kolkata, India - 700108