      Is Open Access

      End-to-End Subtitle Detection and Recognition for Videos in East Asian Languages via CNN Ensemble with Near-Human-Level Performance

      Preprint

          Abstract

          In this paper, we propose an innovative end-to-end subtitle detection and recognition system for videos in East Asian languages. The system consists of multiple stages. Subtitles are first detected by a novel image operator based on the sequence information of consecutive video frames. An ensemble of Convolutional Neural Networks (CNNs) trained on synthetic data is then used to detect and recognize East Asian characters. Finally, a dynamic programming approach leveraging language models is applied to assemble the recognition results for entire text lines. The proposed system achieves average end-to-end accuracies of 98.2% and 98.3% on 40 videos in Simplified Chinese and 40 videos in Traditional Chinese, respectively, significantly outperforming existing methods. This near-perfect accuracy dramatically narrows the gap between human cognitive ability and state-of-the-art algorithms for this task.
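
          The abstract does not spell out how the ensemble outputs and the language model are combined, so the following is only a minimal sketch of one common way such a final stage could work, not the authors' implementation. It assumes each CNN returns a probability for every candidate character at each position, that the ensemble is combined by simple averaging, and that the language model is a bigram table. The function names `average_ensemble` and `viterbi_decode`, the `lm_weight` and `floor` parameters, and the toy probabilities are all hypothetical.

```python
import math


def average_ensemble(predictions):
    """Average the per-model probability dicts for one character position.

    Assumption: simple averaging; the paper's actual ensemble rule is not
    stated in the abstract.
    """
    labels = set().union(*predictions)
    n = len(predictions)
    return {c: sum(p.get(c, 0.0) for p in predictions) / n for c in labels}


def viterbi_decode(position_probs, bigram_logp, lm_weight=1.0, floor=-10.0):
    """Dynamic programming over character positions.

    Returns the sequence maximizing the sum of classifier log-probabilities
    plus weighted bigram log-probabilities. Bigrams missing from the
    (hypothetical) table receive the `floor` log-probability.
    """
    # best[c] holds (score, path) for the best sequence ending in c.
    best = {c: (math.log(p), [c]) for c, p in position_probs[0].items() if p > 0}
    for probs in position_probs[1:]:
        new_best = {}
        for c, p in probs.items():
            if p <= 0:
                continue
            # Choose the best predecessor for character c.
            score, path = max(
                (
                    (s + lm_weight * bigram_logp.get((prev, c), floor), pth)
                    for prev, (s, pth) in best.items()
                ),
                key=lambda t: t[0],
            )
            new_best[c] = (score + math.log(p), path + [c])
        best = new_best
    score, path = max(best.values(), key=lambda t: t[0])
    return "".join(path), score


# Toy example: two hypothetical models disagree on two visually similar
# characters; the bigram table favours the common word.
first = average_ensemble([{"未": 0.45, "末": 0.55}, {"未": 0.50, "末": 0.50}])
second = {"来": 0.9, "米": 0.1}
bigrams = {("未", "来"): math.log(0.5)}
text, score = viterbi_decode([first, second], bigrams)
print(text)  # prints 未来
```

          In this toy case, picking the most probable character at each position independently would give "末来", while the language model term steers the decoder to "未来". A real system would use a much larger character set and a language model trained on text, and might weight the two terms differently.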

                Author and article information

                Journal
                2016-11-18
                Article
                1611.06159

                http://arxiv.org/licenses/nonexclusive-distrib/1.0/

                History
                Custom metadata
                35 pages
                cs.CV

                Computer vision & Pattern recognition