Open Access

      A Theoretically Grounded Application of Dropout in Recurrent Neural Networks

      Preprint


          Abstract

          Recurrent neural networks (RNNs) stand at the forefront of many recent developments in deep learning. Yet a major difficulty with these models is their tendency to overfit. Dropout is a widely used tool for regularisation in deep models, but a long strand of empirical research has claimed that it cannot be applied between the recurrent connections of an RNN. The argument is that noise hinders the network's ability to model sequences and therefore dropout should be applied to the RNN's inputs and outputs alone. But without regularisation in recurrent layers, existing techniques overfit quickly. In this paper we make use of a recently developed theoretical framework casting dropout as approximate variational inference. Based on the framework we derive mathematically grounded tools to apply dropout within the recurrent layers of RNNs, eliminating model overfitting. We apply our new variational inference based dropout technique in LSTM and GRU networks, evaluating the technique empirically. We show that the new approach outperforms existing techniques on sentiment analysis and language modelling tasks, extending our arsenal of variational tools in deep learning.
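The abstract's key mechanism is sampling one dropout mask per sequence and reusing it at every timestep of the recurrent connections, rather than resampling fresh noise at each step. A minimal NumPy sketch of that idea follows; it uses a plain tanh RNN cell rather than the paper's LSTM/GRU networks, and the function names and shapes are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def variational_dropout_mask(batch, hidden, p, rng):
    """Sample ONE dropout mask per sequence (inverted dropout scaling).

    In the variational scheme this mask is reused at every timestep,
    instead of drawing fresh noise per step. (Illustrative helper.)
    """
    keep = 1.0 - p
    return rng.binomial(1, keep, size=(batch, hidden)) / keep

def run_rnn(x, Wx, Wh, p=0.5, seed=0):
    """Toy tanh RNN with variational recurrent dropout.

    x:  (time, batch, input) input sequence
    Wx: (input, hidden) input-to-hidden weights
    Wh: (hidden, hidden) recurrent weights
    """
    rng = np.random.default_rng(seed)
    T, B, _ = x.shape
    H = Wh.shape[0]
    mask = variational_dropout_mask(B, H, p, rng)  # sampled once per sequence
    h = np.zeros((B, H))
    for t in range(T):
        # the SAME mask drops the same hidden units at every timestep
        h = np.tanh(x[t] @ Wx + (h * mask) @ Wh)
    return h
```

Because the mask is fixed across the loop, the same hidden units are dropped throughout the sequence, which is what distinguishes this scheme from naive per-timestep dropout on the recurrent connections.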


          Author and article information

          Journal
          Dates: 2015-12-16, 2016-02-11
          Article type: Article
          arXiv ID: 1512.05287
          Record ID: 2366b442-d4e3-4da7-b2a6-a7256518411d
          License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
          Subject: stat.ML (Machine learning)
