      Is Open Access

      Combination of Deep Cross-Stage Partial Network and Spatial Pyramid Pooling for Automatic Hand Detection

      Big Data and Cognitive Computing
      MDPI AG


          Abstract

          The human hand is involved in many computer vision tasks, such as hand posture estimation, hand movement identification, and human activity analysis, in which hand detection is an important preprocessing step. It is still difficult to detect some hands correctly in a cluttered environment because agile human hands vary widely in appearance and have a wide range of motion. In this study, we provide a brief assessment of CNN-based object identification algorithms, specifically Densenet Yolo V2, Densenet Yolo V2 CSP, Densenet Yolo V2 CSP SPP, Resnet 50 Yolo V2, Resnet 50 CSP, Resnet 50 CSP SPP, Yolo V4 SPP, Yolo V4 CSP SPP, and Yolo V5. The advantages of CSP and SPP are examined and described in detail for each algorithm. Our experiments show that Yolo V4 CSP SPP achieves the highest precision among the evaluated models. The experimental results also show that the CSP and SPP layers help improve the test accuracy of the CNN models. Our model leverages the advantages of CSP and SPP. Our proposed method, Yolo V4 CSP SPP, outperformed previous research results by an average of 8.88 percentage points, an improvement from 87.6% to 96.48%.


          Most cited references (61)


          Deep Residual Learning for Image Recognition


            Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks.

            State-of-the-art object detection networks depend on region proposal algorithms to hypothesize object locations. Advances like SPPnet [1] and Fast R-CNN [2] have reduced the running time of these detection networks, exposing region proposal computation as a bottleneck. In this work, we introduce a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals. An RPN is a fully convolutional network that simultaneously predicts object bounds and objectness scores at each position. The RPN is trained end-to-end to generate high-quality region proposals, which are used by Fast R-CNN for detection. We further merge RPN and Fast R-CNN into a single network by sharing their convolutional features: using the recently popular terminology of neural networks with 'attention' mechanisms, the RPN component tells the unified network where to look. For the very deep VGG-16 model [3], our detection system has a frame rate of 5 fps (including all steps) on a GPU, while achieving state-of-the-art object detection accuracy on PASCAL VOC 2007, 2012, and MS COCO datasets with only 300 proposals per image. In ILSVRC and COCO 2015 competitions, Faster R-CNN and RPN are the foundations of the 1st-place winning entries in several tracks. Code has been made publicly available.

              Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition.

              Existing deep convolutional neural networks (CNNs) require a fixed-size (e.g., 224 × 224) input image. This requirement is "artificial" and may reduce the recognition accuracy for the images or sub-images of an arbitrary size/scale. In this work, we equip the networks with another pooling strategy, "spatial pyramid pooling", to eliminate the above requirement. The new network structure, called SPP-net, can generate a fixed-length representation regardless of image size/scale. Pyramid pooling is also robust to object deformations. With these advantages, SPP-net should in general improve all CNN-based image classification methods. On the ImageNet 2012 dataset, we demonstrate that SPP-net boosts the accuracy of a variety of CNN architectures despite their different designs. On the Pascal VOC 2007 and Caltech101 datasets, SPP-net achieves state-of-the-art classification results using a single full-image representation and no fine-tuning. The power of SPP-net is also significant in object detection. Using SPP-net, we compute the feature maps from the entire image only once, and then pool features in arbitrary regions (sub-images) to generate fixed-length representations for training the detectors. This method avoids repeatedly computing the convolutional features. In processing test images, our method is 24-102× faster than the R-CNN method, while achieving better or comparable accuracy on Pascal VOC 2007. In ImageNet Large Scale Visual Recognition Challenge (ILSVRC) 2014, our methods rank #2 in object detection and #3 in image classification among all 38 teams. This manuscript also introduces the improvement made for this competition.
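
              The fixed-length property described in this abstract can be illustrated with a minimal sketch. It assumes a single-channel feature map stored as a list of rows and a hypothetical pyramid of 1×1, 2×2, and 4×4 bins; the function name `spatial_pyramid_pool` and the bin-boundary rule (floor for the start, ceiling for the end) are illustrative choices, not the authors' implementation. It is also not the stride-1 SPP block used in YOLO-style detectors, which keeps the spatial size and concatenates along channels.

```python
def _ceil_div(a, b):
    """Integer ceiling division, avoiding floating point."""
    return -(-a // b)


def spatial_pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a 2D feature map into a vector of fixed length.

    For each pyramid level n, the map is split into n x n bins and the
    maximum of each bin is kept, so the output always has
    sum(n * n for n in levels) entries regardless of the input size.
    """
    height = len(feature_map)
    width = len(feature_map[0])
    pooled = []
    for n in levels:
        for i in range(n):
            top = (i * height) // n
            bottom = _ceil_div((i + 1) * height, n)
            for j in range(n):
                left = (j * width) // n
                right = _ceil_div((j + 1) * width, n)
                # Bins are never empty as long as the map has at least one cell.
                pooled.append(max(
                    feature_map[r][c]
                    for r in range(top, bottom)
                    for c in range(left, right)
                ))
    return pooled


# Two inputs of different sizes (values are arbitrary, assumed for illustration).
small = [[r * 5 + c for c in range(5)] for r in range(4)]      # 4 x 5
large = [[r * 13 + c for c in range(13)] for r in range(9)]    # 9 x 13

print(len(spatial_pyramid_pool(small)), len(spatial_pyramid_pool(large)))
```

              Both calls return 21 values (1 + 4 + 16), which is why a fully connected layer placed after such a pooling stage no longer constrains the input image size.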

                Author and article information

                Journal: Big Data and Cognitive Computing (BDCC)
                Publisher: MDPI AG
                ISSN: 2504-2289
                Dates: September 2022; August 09 2022
                Volume: 6
                Issue: 3
                Article number: 85
                DOI: 10.3390/bdcc6030085
                02f6b2d9-0011-4e02-8e5e-57ce3130dcf7
                © 2022

                https://creativecommons.org/licenses/by/4.0/
