      Evolution maps and applications

      PeerJ Computer Science
      PeerJ


          Abstract

          Common tasks in document analysis, such as binarization and line extraction, are still considered difficult for highly degraded text documents. Having reliable fundamental information about the characters of the document, such as the distribution of character dimensions and stroke width, can significantly improve the performance of these tasks. We introduce a novel perspective on the image data that maps the evolution of connected components as the gray-scale threshold changes. The maps reveal significant information about the sets of elements in the document, such as characters, noise, stains, and words. This information is further employed to improve a state-of-the-art binarization algorithm and to automatically achieve character size estimation, line extraction, stroke width estimation, and feature distribution analysis, all of which are hard tasks for highly degraded documents.
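The core idea of tracking connected components across gray-scale thresholds can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `component_sizes` and `evolution_table`, the choice of 4-connectivity, the "darker than threshold" foreground rule, and the toy image are all assumptions made for illustration only.

```python
from collections import deque


def component_sizes(image, threshold):
    """Return sorted sizes of 4-connected components of pixels darker than `threshold`."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    sizes = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] >= threshold:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            size = 0
            while queue:
                y, x = queue.popleft()
                size += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and not seen[ny][nx] and image[ny][nx] < threshold:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            sizes.append(size)
    return sorted(sizes)


def evolution_table(image, thresholds):
    """Map each threshold to the component sizes observed at that threshold."""
    return {t: component_sizes(image, t) for t in thresholds}


# Hypothetical toy image: two dark strokes (10, 20), a faint stain (120),
# and a light background (200).
image = [
    [200, 10, 200, 200, 120, 200],
    [200, 10, 200, 200, 120, 200],
    [200, 10, 200, 20, 200, 200],
    [200, 200, 200, 20, 200, 200],
]

table = evolution_table(image, [50, 150, 255])
for t, sizes in table.items():
    print(t, sizes)
```

In this toy case the two strokes appear at the lowest threshold, the stain joins as a separate component at the middle threshold, and everything merges into a single component once the background is included. How components appear, persist, and merge as the threshold rises is the kind of signal the abstract describes, although the paper's actual maps may be constructed differently.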


                Author and article information

                Journal: PeerJ Computer Science
                DOI: 10.7717/peerj-cs.39
                License: http://creativecommons.org/licenses/by/4.0/
