      Is Open Access

      Deep CNN-based Speech Balloon Detection and Segmentation for Comic Books

      Preprint


          Abstract

          We develop a method for the automated detection and segmentation of speech balloons in comic books, including their carriers and tails. Our method is based on a deep convolutional neural network trained on annotated pages of the Graphic Narrative Corpus. More precisely, we use a fully convolutional network inspired by the U-Net architecture, combined with a VGG-16-based encoder. The trained model delivers state-of-the-art performance, with an F1-score of over 0.94. Qualitative results suggest that wiggly tails, curved corners, and even illusory contours do not pose a major problem. Furthermore, the model has learned to distinguish speech balloons from captions. We compare our model to earlier results and discuss some possible applications.
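
          The abstract reports an F1-score but does not say how it is computed. As a minimal sketch, assuming the score is taken pixel-wise over binary balloon masks (an assumption, not confirmed here), it could be calculated as follows. The function name `pixel_f1` and the toy masks are hypothetical and serve only as illustration.

```python
def pixel_f1(pred, truth):
    """Pixel-wise F1 between two binary masks (equal-sized 2D lists of 0/1).

    Assumption: 1 marks a speech-balloon pixel, 0 marks background.
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1      # balloon pixel correctly predicted
            elif p and not t:
                fp += 1      # background predicted as balloon
            elif t and not p:
                fn += 1      # balloon pixel missed
    denom = 2 * tp + fp + fn
    if denom == 0:
        # Both masks are empty; treating this as perfect agreement
        # is a convention chosen for this sketch.
        return 1.0
    # Equivalent to the harmonic mean of precision and recall.
    return 2 * tp / denom


# Toy 3x3 masks (hypothetical data, not taken from the paper).
truth = [[0, 1, 1],
         [0, 1, 1],
         [0, 0, 0]]
pred = [[0, 1, 1],
        [0, 1, 0],
        [0, 1, 0]]

print(pixel_f1(pred, truth))  # 0.75 (3 true positives, 1 false positive, 1 false negative)
```

          A score over a whole test set could be obtained either by averaging per-page values or by pooling pixel counts across pages; which of these applies to the reported figure is not stated in the abstract.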


                Author and article information

                Journal
                21 February 2019
                Article
                1902.08137
                126d65fb-5eae-4426-976b-c10f78124d1f

                http://arxiv.org/licenses/nonexclusive-distrib/1.0/

                History
                Custom metadata
                68T45 (Primary) 68T05, 91E30 (Secondary)
                10 pages, 5 figures, 2 tables
                cs.CV cs.LG q-bio.NC

                Computer vision & Pattern recognition, Neurosciences, Artificial intelligence
