
      An object detection algorithm combining self-attention and YOLOv4 in traffic scene

      research-article
      PLOS ONE
      Public Library of Science


          Abstract

Automobile intelligence is the trend for modern automobiles, and environment perception is the key technology of intelligent automobile research. For autonomous vehicles, detecting objects such as vehicles and pedestrians in traffic scenes is crucial to improving driving safety. However, actual traffic scenes involve many special conditions, such as object occlusion, small objects, and bad weather, which degrade the accuracy of object detection. In this research, the SwinT-YOLOv4 algorithm, based on YOLOv4, is proposed for detecting objects in traffic scenes. Compared with a convolutional neural network (CNN), the vision transformer is more powerful at extracting visual features of objects in an image. In the proposed algorithm, the CNN-based backbone of YOLOv4 is replaced by the Swin Transformer, while the feature-fusing neck and prediction head of YOLOv4 are retained. The proposed model was trained and evaluated on the COCO dataset. Experiments show that our method can significantly improve the accuracy of object detection under special conditions. Equipped with our method, the object detection precision for cars and persons is improved by 1.75%, reaching 89.04% and 94.16%, respectively.
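The architectural change described in the abstract is straightforward to express in code. Below is a minimal PyTorch sketch (not the authors' implementation) of the wiring: a hierarchical Swin-style backbone produces feature maps at strides 8, 16, and 32, which are passed through YOLOv4's unchanged neck to its three prediction heads. ToyBackbone, ToyNeck, and toy_head are illustrative stand-ins so the wiring can be smoke-tested; a real model would plug in a Swin Transformer backbone and the original YOLOv4 SPP/PANet neck and heads.

    import torch
    import torch.nn as nn

    class SwinTYOLOv4(nn.Module):
        # Keep YOLOv4's neck and heads unchanged; swap in a hierarchical
        # (Swin-style) backbone that yields feature maps at strides 8, 16 and 32.
        def __init__(self, backbone, neck, heads):
            super().__init__()
            self.backbone = backbone
            self.neck = neck
            self.heads = nn.ModuleList(heads)

        def forward(self, x):
            c3, c4, c5 = self.backbone(x)        # three hierarchical stages
            p3, p4, p5 = self.neck(c3, c4, c5)   # feature fusion, as in YOLOv4
            return [h(p) for h, p in zip(self.heads, (p3, p4, p5))]

    # Toy stand-ins so the wiring can be smoke-tested without real weights.
    class ToyBackbone(nn.Module):
        def __init__(self, dims=(96, 192, 384)):
            super().__init__()
            self.stages = nn.ModuleList(
                nn.Conv2d(3, d, kernel_size=3, stride=s, padding=1)
                for d, s in zip(dims, (8, 16, 32)))

        def forward(self, x):
            return [stage(x) for stage in self.stages]

    class ToyNeck(nn.Module):
        def forward(self, c3, c4, c5):           # identity "fusion" placeholder
            return c3, c4, c5

    def toy_head(in_ch, num_anchors=3, num_classes=80):
        # Each YOLO head predicts (x, y, w, h, objectness, class scores) per anchor.
        return nn.Conv2d(in_ch, num_anchors * (5 + num_classes), kernel_size=1)

    model = SwinTYOLOv4(ToyBackbone(), ToyNeck(),
                        [toy_head(c) for c in (96, 192, 384)])
    outputs = model(torch.randn(1, 3, 416, 416))
    print([o.shape for o in outputs])            # prediction maps at strides 8, 16, 32

Because the neck and heads are kept as in YOLOv4, only the channel widths of the backbone's three output stages need to match what the neck expects.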


Most cited references (27)


          Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks.

State-of-the-art object detection networks depend on region proposal algorithms to hypothesize object locations. Advances like SPPnet [1] and Fast R-CNN [2] have reduced the running time of these detection networks, exposing region proposal computation as a bottleneck. In this work, we introduce a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals. An RPN is a fully convolutional network that simultaneously predicts object bounds and objectness scores at each position. The RPN is trained end-to-end to generate high-quality region proposals, which are used by Fast R-CNN for detection. We further merge RPN and Fast R-CNN into a single network by sharing their convolutional features; using the recently popular terminology of neural networks with 'attention' mechanisms, the RPN component tells the unified network where to look. For the very deep VGG-16 model [3], our detection system has a frame rate of 5 fps (including all steps) on a GPU, while achieving state-of-the-art object detection accuracy on PASCAL VOC 2007, 2012, and MS COCO datasets with only 300 proposals per image. In ILSVRC and COCO 2015 competitions, Faster R-CNN and RPN are the foundations of the 1st-place winning entries in several tracks. Code has been made publicly available.
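The sliding-window head described in this abstract is compact; a minimal PyTorch sketch of an RPN head over a shared feature map (illustrative only, not the released code) could look as follows. It uses the common single-logit-per-anchor variant for objectness, whereas the original paper uses a two-way softmax, an equivalent parameterization.

    import torch
    import torch.nn as nn

    class RPNHead(nn.Module):
        # Slide a small network over the shared convolutional feature map; at each
        # spatial position, predict an objectness score and 4 box-regression
        # offsets for each of k reference anchors.
        def __init__(self, in_channels=512, num_anchors=9):
            super().__init__()
            self.conv = nn.Conv2d(in_channels, in_channels, kernel_size=3, padding=1)
            self.objectness = nn.Conv2d(in_channels, num_anchors, kernel_size=1)
            self.bbox_deltas = nn.Conv2d(in_channels, num_anchors * 4, kernel_size=1)

        def forward(self, feature_map):
            h = torch.relu(self.conv(feature_map))
            return self.objectness(h), self.bbox_deltas(h)

    # e.g. a VGG-16 conv5 feature map (stride 16) of a roughly 600x800 input image
    scores, deltas = RPNHead()(torch.randn(1, 512, 38, 50))
    print(scores.shape, deltas.shape)   # (1, 9, 38, 50) and (1, 36, 38, 50)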

            Focal loss for dense object detection

            The highest accuracy object detectors to date are based on a two-stage approach popularized by R-CNN, where a classifier is applied to a sparse set of candidate object locations. In contrast, one-stage detectors that are applied over a regular, dense sampling of possible object locations have the potential to be faster and simpler, but have trailed the accuracy of two-stage detectors thus far. In this paper, we investigate why this is the case. We discover that the extreme foreground-background class imbalance encountered during training of dense detectors is the central cause. We propose to address this class imbalance by reshaping the standard cross entropy loss such that it down-weights the loss assigned to well-classified examples. Our novel Focal Loss focuses training on a sparse set of hard examples and prevents the vast number of easy negatives from overwhelming the detector during training. To evaluate the effectiveness of our loss, we design and train a simple dense detector we call RetinaNet. Our results show that when trained with the focal loss, RetinaNet is able to match the speed of previous one-stage detectors while surpassing the accuracy of all existing state-of-the-art two-stage detectors. Code is at: https://github.com/facebookresearch/Detectron.
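The reshaped loss described above is a small modification of binary cross entropy; a minimal PyTorch version (a sketch, not the Detectron reference implementation linked in the abstract), with the paper's default settings alpha = 0.25 and gamma = 2, could look like this:

    import torch
    import torch.nn.functional as F

    def focal_loss(logits, targets, alpha=0.25, gamma=2.0):
        # Standard binary cross entropy, kept per-element so it can be re-weighted.
        ce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
        p = torch.sigmoid(logits)
        p_t = p * targets + (1 - p) * (1 - targets)              # prob. of the true class
        alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
        # (1 - p_t)^gamma down-weights well-classified (easy) examples, so the huge
        # number of easy background anchors no longer dominates the total loss.
        return (alpha_t * (1 - p_t) ** gamma * ce).sum()

    # logits/targets are per-anchor, per-class predictions and 0/1 labels;
    # RetinaNet normalizes this sum by the number of foreground (object) anchors.
    loss = focal_loss(torch.randn(8, 100, 80), torch.zeros(8, 100, 80))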

              FCOS: Fully Convolutional One-Stage Object Detection


                Author and article information

                Contributors
Roles: Data curation, Formal analysis, Investigation, Resources, Software, Validation, Visualization, Writing – original draft
Roles: Conceptualization, Funding acquisition, Methodology, Supervision, Validation
Roles: Project administration, Supervision, Validation, Writing – review & editing
Roles: Funding acquisition, Project administration, Resources, Supervision
                Role: Editor
Journal
PLOS ONE (Public Library of Science, San Francisco, CA, USA)
ISSN: 1932-6203
Published: 18 May 2023; 18(5): e0285654
                Affiliations
                [001] College of Automobile and Traffic Engineering, Nanjing Forestry University, Nanjing, 210037, China
Menoufia University, Egypt
                Author notes

                Competing Interests: The authors have declared that no competing interests exist.

                Author information
                https://orcid.org/0000-0003-1150-6640
                https://orcid.org/0000-0001-7147-0905
                Article
                PONE-D-22-31357
DOI: 10.1371/journal.pone.0285654
                10194927
                37200376
                © 2023 Lu et al

                This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

History
Received: 14 November 2022
Accepted: 27 April 2023
                Page count
                Figures: 12, Tables: 4, Pages: 18
                Funding
                Funded by: Industrial Proactive and Key Technology Program of Jiangsu Province
                Award ID: BE2022053-2
                Award Recipient :
                Funded by: Modern Agriculture-Key and General Program of Jiangsu Province
                Award ID: BE2021339
                Award Recipient :
                Funded by: Philosophy and Social Science Program of the Higher Education Institutions of Jiangsu Province
                Award ID: 2021SJA0151
                Award Recipient :
                Funded by: Science and Technology Innovation Foundation for Young Scientists of Nanjing Forestry University
                Award ID: CX2019018
                Award Recipient :
                This research was funded by the Industrial Proactive and Key Technology Program of Jiangsu Province (grant number BE2022053-2), Modern Agriculture-Key and General Program of Jiangsu Province (grant number BE2021339), Philosophy and Social Science Program of the Higher Education Institutions of Jiangsu Province (grant number 2021SJA0151) and Science and Technology Innovation Foundation for Young Scientists of Nanjing Forestry University (grant number CX2019018). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
                Categories
                Research Article
Research and Analysis Methods > Mathematical and Statistical Techniques > Mathematical Functions > Convolution
Physical Sciences > Mathematics > Applied Mathematics > Algorithms
Research and Analysis Methods > Simulation and Modeling > Algorithms
Computer and Information Sciences > Artificial Intelligence > Machine Learning > Deep Learning
Computer and Information Sciences > Computer Vision
Biology and Life Sciences > Anatomy > Head
Medicine and Health Sciences > Anatomy > Head
Computer and Information Sciences > Computer Vision > Target Detection
Computer and Information Sciences > Neural Networks
Biology and Life Sciences > Neuroscience > Neural Networks
Biology and Life Sciences > Anatomy > Neck
Medicine and Health Sciences > Anatomy > Neck
                Custom metadata
                The COCO2017 dataset was used in this study. The dataset can be downloaded from the Common Objects in Context website ( https://cocodataset.org/#download).
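For reference, once the COCO2017 images and annotation archives are downloaded and unpacked, they can be read with torchvision's built-in COCO wrapper (the paths below are placeholders for wherever the archives were extracted; pycocotools must be installed):

    from torchvision.datasets import CocoDetection

    # Placeholder paths; point these at the extracted COCO2017 files.
    train_set = CocoDetection(
        root="coco2017/train2017",
        annFile="coco2017/annotations/instances_train2017.json",
    )
    image, targets = train_set[0]   # PIL image and its list of annotation dicts
    print(len(train_set), len(targets))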

