      Is Open Access

      Bilinear CNN Models for Fine-grained Visual Recognition

      Preprint

          Abstract

          We present bilinear CNNs, an architecture that efficiently represents an image as a pooled outer product of two CNN features and is effective at fine-grained recognition tasks. These models capture localized part-feature interactions similar to those in part-based models, but can also be seen as an orderless texture representation. Based on this observation, we derive a family of end-to-end trainable bilinear models that generalize classical image representations such as second-order pooling, Fisher vectors, vectors of locally aggregated descriptors (VLAD), and bag-of-visual-words. This allows domain-specific fine-tuning and visualization of the learned models by approximate inversion. Through a number of experiments, we show that these models offer better accuracy, speed, and memory trade-offs than prior work on various fine-grained, texture, and scene recognition datasets. The source code for the complete system is available at http://vis-www.cs.umass.edu/bcnn
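
The "pooled outer product of two CNN features" in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' released code: the function name `bilinear_pool`, the toy feature sizes, and the random arrays standing in for CNN activations are all assumptions made for illustration, and sum-pooling over locations is assumed as the pooling step.

```python
import numpy as np

def bilinear_pool(feat_a, feat_b):
    """Sum-pool the outer product of two sets of local features.

    feat_a: (L, M) array, one M-dim descriptor per image location.
    feat_b: (L, N) array, one N-dim descriptor for the same L locations.
    Returns a flat (M * N,) image descriptor.
    (Hypothetical helper for illustration only.)
    """
    if feat_a.shape[0] != feat_b.shape[0]:
        raise ValueError("both feature sets must cover the same locations")
    # Summing outer(a_l, b_l) over all locations l equals the product A^T B.
    pooled = feat_a.T @ feat_b
    return pooled.reshape(-1)

# Random stand-ins for the outputs of two CNN feature extractors
# (assumed toy sizes: a 7x7 grid of locations, 8 and 5 channels).
rng = np.random.default_rng(0)
feat_a = rng.standard_normal((49, 8))
feat_b = rng.standard_normal((49, 5))
desc = bilinear_pool(feat_a, feat_b)

# Shuffling the locations identically in both feature sets leaves the
# descriptor unchanged, which is the "orderless" property.
perm = rng.permutation(49)
desc_shuffled = bilinear_pool(feat_a[perm], feat_b[perm])
```

Because the sum over locations discards spatial order, `desc` and `desc_shuffled` agree up to floating-point error, while each entry of the descriptor still records how strongly one channel of the first feature co-occurs with one channel of the second at the same location.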


          Author and article information

          Journal
          2015-04-29
          2016-11-28
          Article
          1504.07889
          66f61b6d-77e1-45ad-9e68-c42c356d8c84

          http://arxiv.org/licenses/nonexclusive-distrib/1.0/

          Custom metadata
          cs.CV

          Computer vision & Pattern recognition
