1
views
0
recommends
+1 Recommend
0 collections
    0
    shares
      • Record: found
      • Abstract: found
      • Article: found
      Is Open Access

      Persian Optical Character Recognition Using Deep Bidirectional Long Short-Term Memory

      , , ,
      Applied Sciences
      MDPI AG

      Read this article at

      Bookmark
          There is no author summary for this article yet. Authors can add summaries to their articles on ScienceOpen to make them more accessible to a non-specialist audience.

          Abstract

          Optical Character Recognition (OCR) is a system of converting images, including text,into editable text and is applied to various languages such as English, Arabic, and Persian. While these languages have similarities, their fundamental differences can create unique challenges. In Persian, continuity between Characters, the existence of semicircles, dots, oblique, and left-to-right characters such as English words in the context are some of the most important challenges in designing Persian OCR systems. Our proposed framework, Bina, is designed in a special way to address the issue of continuity by utilizing Convolution Neural Network (CNN) and deep bidirectional Long-Short Term Memory (BLSTM), a type of LSTM networks that has access to both past and future context. A huge and diverse dataset, including about 2M samples of both Persian and English contexts,consisting of various fonts and sizes, is also generated to train and test the performance of the proposed model. Various configurations are tested to find the optimal structure of CNN and BLSTM. The results show that Bina successfully outperformed state of the art baseline algorithm by achieving about 96% accuracy in the Persian and 88% accuracy in the Persian and English contexts.

          Related collections

          Most cited references41

          • Record: found
          • Abstract: found
          • Article: not found

          Deep learning.

          Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. These methods have dramatically improved the state-of-the-art in speech recognition, visual object recognition, object detection and many other domains such as drug discovery and genomics. Deep learning discovers intricate structure in large data sets by using the backpropagation algorithm to indicate how a machine should change its internal parameters that are used to compute the representation in each layer from the representation in the previous layer. Deep convolutional nets have brought about breakthroughs in processing images, video, speech and audio, whereas recurrent nets have shone light on sequential data such as text and speech.
            Bookmark
            • Record: found
            • Abstract: not found
            • Conference Proceedings: not found

            Deep Residual Learning for Image Recognition

              Bookmark
              • Record: found
              • Abstract: not found
              • Article: not found

              Gradient-based learning applied to document recognition

                Bookmark

                Author and article information

                Contributors
                (View ORCID Profile)
                (View ORCID Profile)
                (View ORCID Profile)
                Journal
                ASPCC7
                Applied Sciences
                Applied Sciences
                MDPI AG
                2076-3417
                November 2022
                November 19 2022
                : 12
                : 22
                : 11760
                Article
                10.3390/app122211760
                d8988f52-9f98-414d-8126-979480957bb1
                © 2022

                https://creativecommons.org/licenses/by/4.0/

                History

                Comments

                Comment on this article