      Machine Learning on Sequential Data Using a Recurrent Weighted Average

      Preprint


          Abstract

          Recurrent Neural Networks (RNNs) are a type of statistical model designed to handle sequential data. The model reads a sequence one symbol at a time, and each symbol is processed using information collected from the previous symbols. In existing RNN architectures, each symbol is processed using only information from the previous processing step. To overcome this limitation, we propose a new kind of RNN model that computes a recurrent weighted average (RWA) over every past processing step. Because the RWA can be computed as a running average, the computational overhead scales like that of any other RNN. The approach essentially reformulates the attention mechanism into a stand-alone model. When assessed, the RWA model is found to train faster and generalize better than a standard LSTM model on the variable copy problem, the adding problem, classification of artificial grammar, classification of sequences by length, and classification of MNIST handwritten digits (where the pixels are read sequentially one at a time).
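          The running-average formulation described above can be made concrete with a small sketch. Note this is a minimal illustration, not the paper's reference implementation: the weight-matrix names, gating, and tanh/exp choices are assumptions consistent with the general description, and this naive version omits the numerical rescaling a real implementation would need to keep exp() from overflowing on long sequences.

```python
import numpy as np

def rwa_step(x_t, h_prev, num, den, params):
    """One step of a recurrent weighted average (RWA) cell.

    The hidden state is a weighted average over ALL past steps,
    maintained incrementally as running numerator/denominator sums,
    so the per-step cost matches that of an ordinary RNN.
    """
    W_u, W_g, W_a, b_u, b_g = params
    xh = np.concatenate([x_t, h_prev])
    u = W_u @ x_t + b_u            # feature computed from the current input
    g = np.tanh(W_g @ xh + b_g)    # gate conditioned on input and state
    z = u * g                      # value contributed at this step
    a = W_a @ xh                   # attention score (log of the weight)
    w = np.exp(a)                  # unnormalized attention weight
    num = num + z * w              # running weighted sum of values
    den = den + w                  # running sum of weights
    h_t = np.tanh(num / den)       # hidden state = normalized weighted average
    return h_t, num, den

# Toy usage: 5 steps of 3-dimensional input, 4 hidden units.
rng = np.random.default_rng(0)
nx, nh = 3, 4
params = (rng.normal(size=(nh, nx)),
          rng.normal(size=(nh, nx + nh)),
          rng.normal(size=(nh, nx + nh)),
          np.zeros(nh), np.zeros(nh))
h, num, den = np.zeros(nh), np.zeros(nh), np.zeros(nh)
for t in range(5):
    h, num, den = rwa_step(rng.normal(size=nx), h, num, den, params)
```

          Because `num` and `den` accumulate before the division, the denominator is strictly positive from the first step onward, and the hidden state at every step reflects a weighted average over the entire history rather than only the previous step.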


          Most cited references (2)


          Show and tell: A neural image caption generator


            Convolutional LSTM Networks for Subcellular Localization of Proteins

            Machine learning is widely used to analyze biological sequence data. Non-sequential models such as SVMs or feed-forward neural networks are often used even though they have no natural way of handling sequences of varying length. Recurrent neural networks such as the long short-term memory (LSTM) model, on the other hand, are designed to handle sequences. In this study we demonstrate that LSTM networks predict the subcellular location of proteins with high accuracy (0.902) given only the protein sequence, outperforming current state-of-the-art algorithms. We further improve performance by introducing convolutional filters and experiment with an attention mechanism that lets the LSTM focus on specific parts of the protein. Lastly, we introduce new visualizations of both the convolutional filters and the attention mechanism, and show how they can be used to extract biologically relevant knowledge from the LSTM networks.
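            The convolutional front end mentioned above can be sketched as sliding 1D filters over a one-hot encoded sequence. This is an illustrative assumption of the setup, not the study's code: the filter shapes, ReLU nonlinearity, and toy encoding are chosen for clarity, and the resulting feature map is what would be fed to the LSTM.

```python
import numpy as np

def conv1d_features(seq_onehot, filters, bias):
    """Slide 1D convolutional filters over a one-hot encoded sequence.

    seq_onehot: (L, A) sequence of length L over an alphabet of size A.
    filters:    (F, K, A) bank of F filters of width K.
    Returns a (L - K + 1, F) feature map (valid convolution + ReLU).
    """
    L, A = seq_onehot.shape
    F, K, _ = filters.shape
    out = np.empty((L - K + 1, F))
    for i in range(L - K + 1):
        window = seq_onehot[i:i + K]  # (K, A) slice of the sequence
        # Correlate every filter with this window in one tensordot.
        out[i] = np.tensordot(filters, window, axes=([1, 2], [0, 1])) + bias
    return np.maximum(out, 0.0)       # ReLU nonlinearity

# Toy protein of length 50 over the 20-letter amino-acid alphabet,
# scanned with 8 random filters of width 5.
rng = np.random.default_rng(1)
seq = np.eye(20)[rng.integers(0, 20, size=50)]
feats = conv1d_features(seq, rng.normal(size=(8, 5, 20)), np.zeros(8))
```

            Each row of the feature map summarizes a local window of residues; a recurrent layer reading these rows in order sees motif-level features instead of raw amino acids.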

              Author and article information

              Date: 2017-03-03
              Article: arXiv:1703.01253

              http://creativecommons.org/licenses/by/4.0/

              Subject classes: stat.ML, cs.LG

              Machine learning, Artificial intelligence
