      Deep Learning for Computer Vision: A Brief Review


          Abstract

          In recent years, deep learning methods have been shown to outperform previous state-of-the-art machine learning techniques in several fields, with computer vision being one of the most prominent cases. This review paper provides a brief overview of some of the most significant deep learning schemes used in computer vision problems, that is, Convolutional Neural Networks, Deep Boltzmann Machines and Deep Belief Networks, and Stacked Denoising Autoencoders. A brief account of their history, structure, advantages, and limitations is given, followed by a description of their applications in various computer vision tasks, such as object detection, face recognition, action and activity recognition, and human pose estimation. Finally, a brief overview is given of future directions in designing deep learning schemes for computer vision problems and the challenges involved therein.

          Most cited references (92)


          DeepFace: Closing the Gap to Human-Level Performance in Face Verification


            FaceNet: A Unified Embedding for Face Recognition and Clustering

            (2015)
            Despite significant recent advances in the field of face recognition, implementing face verification and recognition efficiently at scale presents serious challenges to current approaches. In this paper we present a system, called FaceNet, that directly learns a mapping from face images to a compact Euclidean space where distances directly correspond to a measure of face similarity. Once this space has been produced, tasks such as face recognition, verification and clustering can be easily implemented using standard techniques with FaceNet embeddings as feature vectors. Our method uses a deep convolutional network trained to directly optimize the embedding itself, rather than an intermediate bottleneck layer as in previous deep learning approaches. To train, we use triplets of roughly aligned matching / non-matching face patches generated using a novel online triplet mining method. The benefit of our approach is much greater representational efficiency: we achieve state-of-the-art face recognition performance using only 128 bytes per face. On the widely used Labeled Faces in the Wild (LFW) dataset, our system achieves a new record accuracy of 99.63%. On YouTube Faces DB it achieves 95.12%. Our system cuts the error rate in comparison to the best published result by 30% on both datasets. We also introduce the concept of harmonic embeddings, and a harmonic triplet loss, which describe different versions of face embeddings (produced by different networks) that are compatible with each other and allow for direct comparison between each other.

              One-shot learning of object categories.

              Learning visual models of object categories notoriously requires hundreds or thousands of training examples. We show that it is possible to learn much information about a category from just one, or a handful, of images. The key insight is that, rather than learning from scratch, one can take advantage of knowledge coming from previously learned categories, no matter how different these categories might be. We explore a Bayesian implementation of this idea. Object categories are represented by probabilistic models. Prior knowledge is represented as a probability density function on the parameters of these models. The posterior model for an object category is obtained by updating the prior in the light of one or more observations. We test a simple implementation of our algorithm on a database of 101 diverse object categories. We compare category models learned by an implementation of our Bayesian approach to models learned by Maximum Likelihood (ML) and Maximum A Posteriori (MAP) methods. We find that on a database of more than 100 categories, the Bayesian approach produces informative models when the number of training examples is too small for other methods to operate successfully.
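
As a rough illustration of the triplet-based training mentioned in the FaceNet abstract above, the sketch below computes a hinge-style triplet loss on L2-normalized embeddings. The abstract does not state the loss formula, so the squared Euclidean distance, the margin of 0.2, the function names, and the toy 2-D vectors are all assumptions chosen for illustration, not the authors' implementation.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (assumes v is not the zero vector)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge loss that is zero once the negative is farther from the anchor
    than the positive by at least `margin` (margin value is an assumption)."""
    pos = squared_distance(anchor, positive)
    neg = squared_distance(anchor, negative)
    return max(0.0, pos - neg + margin)


# Toy embeddings: "positive" points almost the same way as "anchor",
# "negative" is orthogonal to it.
anchor = l2_normalize([1.0, 0.0])
positive = l2_normalize([0.9, 0.1])
negative = l2_normalize([0.0, 1.0])

easy = triplet_loss(anchor, positive, negative)  # well-separated triplet
hard = triplet_loss(anchor, negative, positive)  # roles swapped: violates the margin
print(easy, hard)
```

In this sketch a well-separated triplet yields zero loss and so would contribute nothing to training, which may be one reason the abstract stresses mining triplets rather than sampling them at random.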

                Author and article information

                Contributors
                Journal
                Computational Intelligence and Neuroscience (Comput Intell Neurosci; CIN)
                Hindawi
                ISSN 1687-5265, 1687-5273
                1 February 2018
                Affiliations
                1. Department of Informatics, Technological Educational Institute of Athens, 12210 Athens, Greece
                2. National Technical University of Athens, 15780 Athens, Greece
                Author notes

                Academic Editor: Diego Andina

                Article
                DOI: 10.1155/2018/7068349
                5816885
                Copyright © 2018 Athanasios Voulodimos et al.

                This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

                Funding
                Funded by: State Scholarships Foundation
                Funded by: European Social Fund
                Funded by: Greek national funds
                Categories
                Review Article

                Neurosciences
