      Walk and Learn: Facial Attribute Representation Learning from Egocentric Video and Contextual Data

      Preprint


          Abstract

The way people look in terms of facial attributes (ethnicity, hair color, facial hair, etc.) and the clothes or accessories they wear (sunglasses, hat, hoodies, etc.) is highly dependent on geo-location and weather condition, respectively. This work explores, for the first time, the use of this contextual information, as people with wearable cameras walk across different neighborhoods of a city, in order to learn a rich feature representation for facial attribute classification, without the costly manual annotation required by previous methods. By tracking the faces of casual walkers in more than 40 hours of egocentric video, we are able to cover tens of thousands of different identities and automatically extract nearly 5 million pairs of face images, drawn either from the same face track or from different tracks, along with their weather and location context, under pose and lighting variations. These image pairs are then fed into a deep network that preserves the similarity of images connected by the same track, in order to capture identity-related attribute features, and optimizes for location and weather prediction to capture additional facial attribute features. Finally, the network is fine-tuned with manually annotated samples. We perform an extensive experimental analysis on wearable data and two standard benchmark datasets based on web images (LFWA and CelebA). Our method outperforms a network trained from scratch by a large margin. Moreover, even without using manually annotated identity labels for pre-training as in previous methods, our approach achieves results that are better than the state of the art.
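The pre-training objective sketched in the abstract combines a similarity term over same-track face pairs with classification terms for the location and weather context. The following is a minimal illustration of that kind of multi-task loss, assuming a contrastive (margin-based) similarity term and softmax cross-entropy context heads; the function names, margin, and weighting `lam` are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def contrastive_loss(f1, f2, same_track, margin=1.0):
    """Pull together embeddings of faces from the same track;
    push apart embeddings from different tracks (hinge with margin)."""
    d = np.linalg.norm(f1 - f2)
    if same_track:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2

def cross_entropy(logits, label):
    """Softmax cross-entropy, as used for the location/weather heads."""
    z = logits - logits.max()                 # stabilize the softmax
    log_probs = z - np.log(np.exp(z).sum())
    return -log_probs[label]

def total_loss(f1, f2, same_track,
               loc_logits, loc_label, wx_logits, wx_label, lam=1.0):
    """Identity-preserving similarity term plus contextual
    (location + weather) classification terms."""
    return (contrastive_loss(f1, f2, same_track)
            + lam * (cross_entropy(loc_logits, loc_label)
                     + cross_entropy(wx_logits, wx_label)))
```

Identical embeddings from the same track contribute zero similarity loss, while identical embeddings from different tracks pay the full margin penalty, which is what drives the representation to separate identities without identity labels.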


Most cited references (5)


          Describable Visual Attributes for Face Verification and Image Search


            The general inefficiency of batch training for gradient descent learning.

Gradient descent training of neural networks can be done in either a batch or on-line manner. A widely held myth in the neural network community is that batch training is as fast or faster, and/or more 'correct', than on-line training because it supposedly uses a better approximation of the true gradient for its weight updates. This paper explains why batch training is almost always slower than on-line training, often orders of magnitude slower, especially on large training sets. The main reason is the ability of on-line training to follow curves in the error surface throughout each epoch, which allows it to safely use a larger learning rate and thus converge in fewer iterations through the training data. Empirical results on a large (20,000-instance) speech recognition task and on 26 other learning tasks demonstrate that convergence can be reached significantly faster using on-line training than batch training, with no apparent difference in accuracy.
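The batch vs. on-line distinction described above can be made concrete on a one-parameter least-squares problem. This toy sketch is not from the cited paper; the data, learning rate, and epoch count are arbitrary. Batch training takes one gradient step per pass over the data, while on-line training takes one step per example:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, 200)
y = 3.0 * X  # toy target: the true weight is 3.0

def batch_train(epochs=50, lr=0.5):
    """One weight update per epoch, using the gradient over the full set."""
    w = 0.0
    for _ in range(epochs):
        grad = np.mean(2 * (w * X - y) * X)  # d/dw of mean squared error
        w -= lr * grad
    return w

def online_train(epochs=50, lr=0.5):
    """One weight update per training example (on-line / stochastic)."""
    w = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            w -= lr * 2 * (w * xi - yi) * xi
    return w
```

Both variants recover the true weight here; the paper's point is that on this per-example schedule, on-line training makes many small corrective steps within each epoch, which on large training sets lets it tolerate a larger learning rate than one full-batch step per epoch.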

              Image ranking and retrieval based on multi-attribute queries


                Author and article information

Article: arXiv:1604.06433
Dates: 2016-04-21, 2016-04-27
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Custom metadata: Paper accepted by CVPR 2016
Subject: cs.CV (Computer vision & Pattern recognition)
