      Is Open Access

      Grid R-CNN

      Preprint


          Abstract

          This paper proposes a novel object detection framework named Grid R-CNN, which adopts a grid-guided localization mechanism for accurate object detection. Unlike traditional regression-based methods, Grid R-CNN captures spatial information explicitly and benefits from the position-sensitive property of fully convolutional architectures. Instead of using only two independent points, we design a multi-point supervision formulation that encodes more clues, reducing the impact of inaccurate predictions at specific points. To take full advantage of the correlation among points in a grid, we propose a two-stage information fusion strategy that fuses the feature maps of neighboring grid points. The grid-guided localization approach can be easily extended to different state-of-the-art detection frameworks. Grid R-CNN yields high-quality object localization: experiments show that it achieves a 4.1% AP gain at IoU=0.8 and a 10.0% AP gain at IoU=0.9 on the COCO benchmark compared to Faster R-CNN with a Res50 backbone and FPN architecture.
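
          The grid-guided idea in the abstract can be illustrated with a minimal sketch, under stated assumptions. It assumes each grid point is predicted as a heatmap over the region of interest, that the peak of each heatmap gives the point location, and that each box edge is the score-weighted mean of the grid points lying on it. The abstract does not specify these details, so the aggregation rule and the names `peak`, `box_from_grid`, and `iou` are hypothetical; the feature fusion step is not modeled. The `iou` helper shows the overlap measure behind the IoU=0.8 and IoU=0.9 thresholds.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in image coordinates
Heatmap = List[List[float]]              # rows of per-pixel scores


def peak(heatmap: Heatmap) -> Tuple[int, int, float]:
    """Return (col, row, score) of the highest-scoring heatmap pixel."""
    best = (0, 0, heatmap[0][0])
    for r, row in enumerate(heatmap):
        for c, score in enumerate(row):
            if score > best[2]:
                best = (c, r, score)
    return best


def weighted_mean(pairs: List[Tuple[float, float]]) -> float:
    """Score-weighted mean of (value, score) pairs; plain mean if all scores are 0."""
    total = sum(s for _, s in pairs)
    if total == 0:
        return sum(v for v, _ in pairs) / len(pairs)
    return sum(v * s for v, s in pairs) / total


def box_from_grid(heatmaps: List[List[Heatmap]], roi: Box) -> Box:
    """Turn an n x n array of grid-point heatmaps into a single box.

    heatmaps[i][j] belongs to the grid point in row i, column j.
    Assumption: each edge is the score-weighted mean of the points on it.
    """
    n = len(heatmaps)
    x1, y1, x2, y2 = roi
    pts = []
    for i in range(n):
        row_pts = []
        for j in range(n):
            hm = heatmaps[i][j]
            c, r, s = peak(hm)
            h, w = len(hm), len(hm[0])
            # Map the center of the peak pixel back into image coordinates.
            x = x1 + (c + 0.5) / w * (x2 - x1)
            y = y1 + (r + 0.5) / h * (y2 - y1)
            row_pts.append((x, y, s))
        pts.append(row_pts)

    left = weighted_mean([(pts[i][0][0], pts[i][0][2]) for i in range(n)])
    right = weighted_mean([(pts[i][n - 1][0], pts[i][n - 1][2]) for i in range(n)])
    top = weighted_mean([(pts[0][j][1], pts[0][j][2]) for j in range(n)])
    bottom = weighted_mean([(pts[n - 1][j][1], pts[n - 1][j][2]) for j in range(n)])
    return (left, top, right, bottom)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0
```

          Because every edge is supported by several points in this sketch, one badly placed point shifts the edge less than it would with two independent corners, which is the intuition the abstract gives for multi-point supervision.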


                Author and article information

                Date: 29 November 2018
                arXiv: 1811.12030
                License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
                Subject: cs.CV (Computer vision & Pattern recognition)
