Abstract
<p class="first" id="d8292777e59">Learning to store information over extended time
intervals by recurrent backpropagation
takes a very long time, mostly because of insufficient, decaying error backflow. We
briefly review Hochreiter's (1991) analysis of this problem, then address it by introducing
a novel, efficient, gradient-based method called long short-term memory (LSTM). Truncating
the gradient where this does not do harm, LSTM can learn to bridge minimal time lags
in excess of 1000 discrete-time steps by enforcing constant error flow through constant
error carousels within special units. Multiplicative gate units learn to open and
close access to the constant error flow. LSTM is local in space and time; its computational
complexity per time step and weight is O(1). Our experiments with artificial data
involve local, distributed, real-valued, and noisy pattern representations. In comparisons
with real-time recurrent learning, backpropagation through time, recurrent cascade
correlation, Elman nets, and neural sequence chunking, LSTM leads to many more successful
runs, and learns much faster. LSTM also solves complex, artificial long-time-lag tasks
that have never been solved by previous recurrent network algorithms.
</p>
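The constant error carousel and the multiplicative gates described in the abstract can be illustrated with a minimal sketch of a single memory cell. It assumes the cell has no forget gate and uses `tanh` for both squashing functions. The function names, the gate pre-activations passed in by hand, and the comparison weight of 0.9 are all hypothetical choices made for illustration, not taken from the abstract. The sketch shows only the forward update and the scaling factor of a linear self-loop, not the learning algorithm itself.

```python
import math


def sigmoid(x):
    """Logistic function, used here for the gate activations (an assumption)."""
    return 1.0 / (1.0 + math.exp(-x))


def memory_cell_step(state, cell_input, in_gate_net, out_gate_net):
    """One forward step of a single memory cell without a forget gate.

    The internal state feeds back to itself with a fixed weight of 1.0,
    which is what keeps the error flow through it constant.
    """
    y_in = sigmoid(in_gate_net)    # input gate: how much new input is written
    y_out = sigmoid(out_gate_net)  # output gate: how much of the state is exposed
    new_state = state + y_in * math.tanh(cell_input)  # self-loop weight is 1.0
    output = y_out * math.tanh(new_state)
    return new_state, output


def backflow_factor(self_weight, steps):
    """Factor by which error is scaled when flowing back `steps` steps
    through a linear unit whose self-connection has weight `self_weight`."""
    return self_weight ** steps


# Write a value with the input gate open, output gate closed.
state, _ = memory_cell_step(0.0, 2.0, in_gate_net=10.0, out_gate_net=-10.0)
state_written = state

# Hold it for 1000 steps of distractor input with the input gate closed.
for t in range(1000):
    state, _ = memory_cell_step(state, math.sin(t), in_gate_net=-50.0, out_gate_net=-10.0)
state_after = state

# Read it out by opening the output gate.
_, output = memory_cell_step(state, 0.0, in_gate_net=-50.0, out_gate_net=10.0)

print("state change over 1000 steps:", abs(state_after - state_written))
print("backflow factor, weight 1.0:", backflow_factor(1.0, 1000))
print("backflow factor, weight 0.9:", backflow_factor(0.9, 1000))
```

With the self-connection fixed at 1.0, the stored value and the error scaling factor are both preserved across the lag, whereas a self-weight below 1.0 shrinks the factor exponentially with the number of steps. Each step also touches a fixed number of quantities per cell, in line with the constant per-step cost stated in the abstract. In a trained network the gate pre-activations would come from learned weights rather than being set by hand as they are here.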