
An Online Attention-based Model for Speech Recognition

Preprint


Abstract

Attention-based end-to-end (E2E) speech recognition models such as Listen, Attend, and Spell (LAS) can achieve better results than traditional hybrid automatic speech recognition (ASR) models on LVCSR tasks. LAS combines the acoustic, pronunciation, and language model components of a traditional ASR system into a single neural network. However, such architectures are difficult to use for streaming speech recognition because of their bidirectional listener architecture and attention mechanism. In this work, we propose a latency-controlled bidirectional long short-term memory (LC-BLSTM) listener to reduce the delay of the listener's forward computation. On the attention side, we propose adaptive monotonic chunk-wise attention (AMoChA) to make LAS work online. We explore how each part performs when used alone and obtain results comparable to or better than the LAS baseline. By combining the two methods, we successfully stream the LAS baseline with only 3.5% relative degradation in character error rate (CER) on our Mandarin corpus. We believe our methods can have the same effect on other languages.
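
On the listener side, a rough sketch of the latency-controlled BLSTM idea is shown below. This is not the paper's implementation; the PyTorch layer, chunk length, and right-context width are illustrative assumptions. The forward LSTM streams its hidden state across chunks, while the backward LSTM is restarted for every chunk and sees only a fixed number of future frames, so the delay is bounded.

    # Minimal LC-BLSTM layer sketch (assumed PyTorch API; sizes are illustrative).
    import torch
    import torch.nn as nn

    class LCBLSTMLayer(nn.Module):
        def __init__(self, input_size, hidden_size, chunk_size=20, right_context=10):
            super().__init__()
            self.fwd = nn.LSTM(input_size, hidden_size, batch_first=True)
            self.bwd = nn.LSTM(input_size, hidden_size, batch_first=True)
            self.chunk_size = chunk_size
            self.right_context = right_context

        def forward(self, x):
            # x: (batch, time, feat); returns (batch, time, 2 * hidden_size).
            T = x.size(1)
            outputs, state = [], None
            for start in range(0, T, self.chunk_size):
                end = min(start + self.chunk_size, T)
                # Forward direction: carry hidden state across chunk boundaries.
                fwd_out, state = self.fwd(x[:, start:end], state)
                # Backward direction: run right-to-left over the chunk plus a
                # limited right context, then keep only the current chunk.
                ctx_end = min(end + self.right_context, T)
                rev = torch.flip(x[:, start:ctx_end], dims=[1])
                bwd_out, _ = self.bwd(rev)
                bwd_out = torch.flip(bwd_out, dims=[1])[:, : end - start]
                outputs.append(torch.cat([fwd_out, bwd_out], dim=-1))
            return torch.cat(outputs, dim=1)

Under this scheme the per-chunk delay is bounded by chunk_size + right_context frames rather than by the full utterance length, as a standard BLSTM would require.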
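
On the attention side, AMoChA builds on hard monotonic chunk-wise attention (MoChA). The inference-time sketch below uses simplified dot-product energies and a fixed chunk width w; in AMoChA the chunk size would instead be predicted adaptively, a detail this sketch omits.

    # Minimal MoChA-style decoding step sketch (simplified energies; w is a
    # fixed stand-in for AMoChA's adaptively predicted chunk size).
    import torch

    def mocha_decode_step(query, enc, prev_pos, w=4, threshold=0.5):
        """query: (dim,), enc: (T, dim). Returns (context, new_pos)."""
        T = enc.size(0)
        for t in range(prev_pos, T):
            # Monotonic selection probability for frame t.
            p_select = torch.sigmoid(enc[t] @ query)
            if p_select > threshold:
                # Soft attention over the w frames ending at the selected frame.
                lo = max(0, t - w + 1)
                chunk = enc[lo : t + 1]
                scores = torch.softmax(chunk @ query, dim=0)
                return scores @ chunk, t
        # No frame selected: fall back to the chunk ending at the last frame.
        chunk = enc[max(0, T - w):]
        scores = torch.softmax(chunk @ query, dim=0)
        return scores @ chunk, T - 1

Because the attention endpoint only moves forward and each output step looks at most w frames back from it, the decoder can emit characters as encoder frames arrive instead of waiting for the whole utterance.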


Author and article information

Date: 13 November 2018
Article type: Preprint
arXiv ID: 1811.05247
Record ID: 7ef6fa4d-b05a-4ecc-9305-d6bfe6f1f280
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
arXiv categories: cs.CL, cs.LG, cs.SD, eess.AS
Subjects: Theoretical computer science, Artificial intelligence, Electrical engineering, Graphics & Multimedia design
