      HROM: Learning High-Resolution Representation and Object-Aware Masks for Visual Object Tracking

      research-article

          Abstract

          Siamese network-based trackers treat tracking as cross-correlation between the features of the target template and those of the search region. Feature representation therefore plays an important role in constructing a high-performance tracker. However, all existing Siamese networks extract deep but low-resolution features of the entire patch, which are not robust enough to estimate the target bounding box accurately. To address this issue, we propose a novel high-resolution Siamese network that connects high-to-low resolution convolution streams in parallel and repeatedly exchanges information across resolutions to maintain high-resolution representations. Through a simple yet effective multi-scale feature fusion strategy, the resulting representation is semantically richer and spatially more precise. Moreover, we exploit attention mechanisms to learn object-aware masks for adaptive feature refinement, and use deformable convolution to handle complex geometric transformations. This makes the target more discriminative against distractors and background. Without bells and whistles, extensive experiments on popular tracking benchmarks, including OTB100, UAV123, VOT2018 and LaSOT, demonstrate that the proposed tracker achieves state-of-the-art performance and runs in real time, confirming its efficiency and effectiveness.
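
          The cross-correlation view of Siamese tracking described in the abstract can be illustrated with a minimal NumPy sketch. This is not the HROM implementation: the function names `cross_correlate` and `locate_peak`, the tensor shapes, and the plain sliding-window loop are assumptions made for illustration, and the high-resolution backbone, attention masks, and deformable convolution are omitted entirely. The sketch only shows how a template feature map is slid over a search-region feature map to produce a response map whose peak suggests the target location.

```python
import numpy as np


def cross_correlate(template_feat, search_feat):
    """Slide template features over search features (illustrative helper).

    template_feat: array of shape (C, h, w)
    search_feat:   array of shape (C, H, W), with H >= h and W >= w
    Returns a response map of shape (H - h + 1, W - w + 1).
    """
    c, h, w = template_feat.shape
    c_s, H, W = search_feat.shape
    if c != c_s:
        raise ValueError("template and search must have the same channel count")
    if H < h or W < w:
        raise ValueError("search region must be at least as large as the template")

    response = np.empty((H - h + 1, W - w + 1), dtype=np.float64)
    for i in range(response.shape[0]):
        for j in range(response.shape[1]):
            # Inner product between the template and the current search window,
            # summed over channels and spatial positions.
            window = search_feat[:, i:i + h, j:j + w]
            response[i, j] = np.sum(window * template_feat)
    return response


def locate_peak(response):
    """Return the (row, col) index of the strongest response."""
    row, col = np.unravel_index(np.argmax(response), response.shape)
    return int(row), int(col)


if __name__ == "__main__":
    # Toy features: the template is pasted into an otherwise empty search map
    # at offset (3, 5), so the response should peak at that offset.
    template = np.arange(1, 65, dtype=np.float64).reshape(4, 4, 4)
    search = np.zeros((4, 12, 12))
    search[:, 3:7, 5:9] = template
    print(locate_peak(cross_correlate(template, search)))
```

          In a real tracker the two inputs would come from a shared convolutional backbone, and the correlation would typically be computed with a batched convolution rather than Python loops. The resolution of these feature maps bounds how precisely the peak can be localized, which is the motivation the abstract gives for keeping high-resolution representations.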

                Author and article information

                Journal
                Sensors (Basel, Switzerland)
                Publisher: MDPI
                ISSN: 1424-8220
                26 August 2020
                September 2020
                Volume: 20
                Issue: 17
                Article number: 4807
                Affiliations
                Department of Computer Science, College of Mathematics and Computer Science, Zhejiang Normal University, No 688, Yingbin Road, Jinhua 321004, China; davidzhang@zjnu.edu.cn (D.Z.); 18329020065@163.com (T.W.); 201825200701@zjnu.edu.cn (Y.H.)
                Author notes
                [*] Correspondence: zhonglong@zjnu.edu.cn
                Author information
                https://orcid.org/0000-0002-7593-1593
                https://orcid.org/0000-0002-5271-9215
                Article
                sensors-20-04807
                10.3390/s20174807
                7506602
                35c1600c-53b6-4a4f-98f9-4d9e4bb57d1f
                © 2020 by the authors.

                Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license ( http://creativecommons.org/licenses/by/4.0/).

                History
                : 23 July 2020
                : 22 August 2020
                Categories
                Article

                Biomedical engineering
                Siamese network, high-resolution representation, multi-scale fusion, visual tracking, attention mechanisms, deformable convolution
