
      Deconstructing and reconstructing word embedding algorithms

      Preprint


          Abstract

          Uncontextualized word embeddings are reliable feature representations of words used to obtain high-quality results for various NLP applications. Given the historical success of word embeddings in NLP, we propose a retrospective on some of the best-known word embedding algorithms. In this work, we deconstruct Word2vec, GloVe, and others into a common form, unveiling some of the necessary and sufficient conditions required for making performant word embeddings. We find that each algorithm: (1) fits vector-covector dot products to approximate pointwise mutual information (PMI); and (2) modulates the loss gradient to balance weak and strong signals. We demonstrate that these two algorithmic features are sufficient conditions to construct a novel word embedding algorithm, Hilbert-MLE. We find that its embeddings perform as well as or better than those of the other algorithms across 17 intrinsic and extrinsic datasets.
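          As an illustration of condition (1), the sketch below builds a PMI matrix from toy co-occurrence counts and fits word vectors and context covectors so their dot products approximate it. This is not the authors' code: the toy counts are hypothetical, and it uses plain squared-error gradient descent rather than the likelihood-based objective the paper derives for Hilbert-MLE.

```python
import numpy as np

# Illustrative sketch only: random toy co-occurrence counts standing in
# for statistics gathered by sliding a context window over a real corpus.
rng = np.random.default_rng(0)
V, d = 20, 8                          # vocabulary size, embedding dimension
N = rng.integers(1, 50, size=(V, V)).astype(float)

total = N.sum()
p_i = N.sum(axis=1) / total           # marginal probability of word i
p_j = N.sum(axis=0) / total           # marginal probability of context word j
p_ij = N / total                      # joint probability

# Pointwise mutual information: PMI(i, j) = log[ p(i, j) / (p(i) p(j)) ].
# No zero counts in this toy setting, so the log is well defined.
pmi = np.log(p_ij / np.outer(p_i, p_j))

# Fit word vectors W and context covectors C so that W @ C.T ~ PMI,
# using squared-error gradient descent (purely for illustration).
W = 0.1 * rng.standard_normal((V, d))
C = 0.1 * rng.standard_normal((V, d))
lr = 0.05
for _ in range(2000):
    err = W @ C.T - pmi               # residual of the dot-product fit
    W -= lr * (err @ C) / V
    C -= lr * (err.T @ W) / V

print("mean squared PMI reconstruction error:", np.mean((W @ C.T - pmi) ** 2))
```

          Because d < V, the fit is a low-rank approximation of the PMI matrix; condition (2) in the abstract concerns how the real algorithms reweight the gradient of such a fit so that rare (weak) and frequent (strong) co-occurrence signals are balanced.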


          Author and article information

          Published: 29 November 2019
          arXiv ID: 1911.13280

          License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/

          Comments: 15 pages
          Subjects: cs.CL (Computation and Language); cs.LG (Machine Learning)

          Keywords: Theoretical computer science, Artificial intelligence
