
      Position-aware self-attention based neural sequence labeling

      Pattern Recognition
      Elsevier BV



Most cited references (47)


          Long Short-Term Memory

Learning to store information over extended time intervals by recurrent backpropagation takes a very long time, mostly because of insufficient, decaying error backflow. We briefly review Hochreiter's (1991) analysis of this problem, then address it by introducing a novel, efficient, gradient-based method called long short-term memory (LSTM). Truncating the gradient where this does not do harm, LSTM can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units. Multiplicative gate units learn to open and close access to the constant error flow. LSTM is local in space and time; its computational complexity per time step and weight is O(1). Our experiments with artificial data involve local, distributed, real-valued, and noisy pattern representations. In comparisons with real-time recurrent learning, backpropagation through time, recurrent cascade correlation, Elman nets, and neural sequence chunking, LSTM leads to many more successful runs, and learns much faster. LSTM also solves complex, artificial long-time-lag tasks that have never been solved by previous recurrent network algorithms.
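
The abstract describes the mechanism concretely enough to sketch: a cell state updated only by gated addition (the constant error carousel), with multiplicative gates controlling write and read access. Below is a minimal NumPy sketch of one step of a modern LSTM cell; the forget gate postdates the 1997 paper (Gers et al., 1999), and all sizes and initializations are illustrative, not taken from the paper.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def lstm_step(x, h_prev, c_prev, W, b):
        # The cell state c is the "constant error carousel": it changes only
        # by element-wise gated addition, so error can flow through it across
        # many time steps without decaying.
        z = W @ np.concatenate([x, h_prev]) + b   # all gates in one matmul
        H = h_prev.size
        i = sigmoid(z[0:H])        # input gate: write access to the carousel
        f = sigmoid(z[H:2*H])      # forget gate (later addition, Gers et al. 1999)
        o = sigmoid(z[2*H:3*H])    # output gate: read access
        g = np.tanh(z[3*H:4*H])    # candidate cell input
        c = f * c_prev + i * g     # gated additive update
        h = o * np.tanh(c)         # exposed hidden state
        return h, c

    # Toy usage with illustrative sizes.
    rng = np.random.default_rng(0)
    D, H = 8, 16
    W = rng.normal(0.0, 0.1, size=(4 * H, D + H))  # stacked gate weights
    b = np.zeros(4 * H)
    h, c = np.zeros(H), np.zeros(H)
    for x in rng.normal(size=(100, D)):            # a 100-step sequence
        h, c = lstm_step(x, h, c, W, b)
    print(h.shape, c.shape)                        # (16,) (16,)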

            A tutorial on hidden Markov models and selected applications in speech recognition


              Natural Language Processing (almost) from Scratch

              We propose a unified neural network architecture and learning algorithm that can be applied to various natural language processing tasks including: part-of-speech tagging, chunking, named entity recognition, and semantic role labeling. This versatility is achieved by trying to avoid task-specific engineering and therefore disregarding a lot of prior knowledge. Instead of exploiting man-made input features carefully optimized for each task, our system learns internal representations on the basis of vast amounts of mostly unlabeled training data. This work is then used as a basis for building a freely available tagging system with good performance and minimal computational requirements.
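
As a concrete illustration of learning representations in place of hand-engineered features, here is a minimal NumPy scoring pass in the spirit of the paper's window approach: each token is represented by the concatenated embeddings of a fixed window around it, fed through one shared hidden layer to produce per-tag scores. All sizes are hypothetical, and the paper's sentence-level convolutional variant and structured training criterion are omitted.

    import numpy as np

    rng = np.random.default_rng(1)

    # Hypothetical sizes; the paper's actual dimensions differ.
    V, D, WIN, HID, TAGS = 1000, 50, 5, 300, 9  # vocab, emb dim, window, hidden, tags

    E  = rng.normal(0.0, 0.1, (V, D))           # word embeddings (pretrained on
                                                # unlabeled text in the paper)
    W1 = rng.normal(0.0, 0.1, (HID, WIN * D)); b1 = np.zeros(HID)
    W2 = rng.normal(0.0, 0.1, (TAGS, HID));    b2 = np.zeros(TAGS)

    def tag_scores(word_ids):
        # Score every tag for every token from a fixed window of embeddings,
        # in place of hand-engineered, task-specific input features.
        pad = WIN // 2
        padded = np.pad(word_ids, pad, constant_values=0)  # id 0 = padding token
        scores = []
        for t in range(len(word_ids)):
            x = E[padded[t:t + WIN]].reshape(-1)  # concatenated window embeddings
            h = np.tanh(W1 @ x + b1)              # shared hidden layer
            scores.append(W2 @ h + b2)            # unnormalized per-tag scores
        return np.stack(scores)                   # shape (sentence length, TAGS)

    sentence = rng.integers(1, V, size=7)         # toy sentence of 7 word ids
    print(tag_scores(sentence).shape)             # (7, 9)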

                Author and article information

Journal: Pattern Recognition
Publisher: Elsevier BV
ISSN: 0031-3203
Published: February 2021
Volume: 110
Article number: 107636
DOI: 10.1016/j.patcog.2020.107636
© 2021
License: https://www.elsevier.com/tdm/userlicense/1.0/

