      Deep Learning for Computer Vision: A Brief Review

      review-article


          Abstract

          In recent years, deep learning methods have been shown to outperform previous state-of-the-art machine learning techniques in several fields, with computer vision being one of the most prominent cases. This review paper provides a brief overview of some of the most significant deep learning schemes used in computer vision problems, namely Convolutional Neural Networks, Deep Boltzmann Machines and Deep Belief Networks, and Stacked Denoising Autoencoders. A brief account of their history, structure, advantages, and limitations is given, followed by a description of their applications in various computer vision tasks, such as object detection, face recognition, action and activity recognition, and human pose estimation. Finally, a brief overview is given of future directions in designing deep learning schemes for computer vision problems and the challenges involved therein.


          Most cited references (92)

          DeepFace: Closing the Gap to Human-Level Performance in Face Verification

            One-shot learning of object categories.

            Learning visual models of object categories notoriously requires hundreds or thousands of training examples. We show that it is possible to learn much information about a category from just one, or a handful, of images. The key insight is that, rather than learning from scratch, one can take advantage of knowledge coming from previously learned categories, no matter how different these categories might be. We explore a Bayesian implementation of this idea. Object categories are represented by probabilistic models. Prior knowledge is represented as a probability density function on the parameters of these models. The posterior model for an object category is obtained by updating the prior in the light of one or more observations. We test a simple implementation of our algorithm on a database of 101 diverse object categories. We compare category models learned by an implementation of our Bayesian approach to models learned by Maximum Likelihood (ML) and Maximum A Posteriori (MAP) methods. We find that on a database of more than 100 categories, the Bayesian approach produces informative models when the number of training examples is too small for other methods to operate successfully.

              Face recognition: a convolutional neural-network approach.

              We present a hybrid neural-network for human face recognition which compares favourably with other methods. The system combines local image sampling, a self-organizing map (SOM) neural network, and a convolutional neural network. The SOM provides a quantization of the image samples into a topological space where inputs that are nearby in the original space are also nearby in the output space, thereby providing dimensionality reduction and invariance to minor changes in the image sample, and the convolutional neural network provides partial invariance to translation, rotation, scale, and deformation. The convolutional network extracts successively larger features in a hierarchical set of layers. We present results using the Karhunen-Loeve transform in place of the SOM, and a multilayer perceptron (MLP) in place of the convolutional network for comparison. We use a database of 400 images of 40 individuals which contains quite a high degree of variability in expression, pose, and facial details. We analyze the computational complexity and discuss how new classes could be added to the trained recognizer.

                Author and article information

                Journal: Computational Intelligence and Neuroscience (Comput Intell Neurosci)
                Publisher: Hindawi
                ISSN: 1687-5265, 1687-5273
                Published: 1 February 2018
                Volume: 2018
                Article ID: 7068349
                Affiliations
                1Department of Informatics, Technological Educational Institute of Athens, 12210 Athens, Greece
                2National Technical University of Athens, 15780 Athens, Greece
                Author notes

                Academic Editor: Diego Andina

                Author information
                http://orcid.org/0000-0002-0632-9769
                Article
                DOI: 10.1155/2018/7068349
                PMCID: PMC5816885
                PMID: 29487619
                Copyright © 2018 Athanasios Voulodimos et al.

                This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                Received: 17 June 2017
                Accepted: 27 November 2017
                Funding
                Funded by: State Scholarships Foundation
                Funded by: European Social Fund
                Funded by: Greek national funds
                Categories
                Review Article

                Neurosciences
