3D Convolutional Neural Networks for Human Action Recognition


Abstract

We consider the automated recognition of human actions in surveillance videos. Most current methods build classifiers based on complex handcrafted features computed from the raw inputs. Convolutional neural networks (CNNs) are a type of deep model that can act directly on the raw inputs. However, such models are currently limited to handling 2D inputs. In this paper, we develop a novel 3D CNN model for action recognition. This model extracts features from both the spatial and the temporal dimensions by performing 3D convolutions, thereby capturing the motion information encoded in multiple adjacent frames. The developed model generates multiple channels of information from the input frames, and the final feature representation combines information from all channels. To further boost the performance, we propose regularizing the outputs with high-level features and combining the predictions of a variety of different models. We apply the developed models to recognize human actions in the real-world environment of airport surveillance videos, and they achieve superior performance in comparison to baseline methods.
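The key operation in the abstract is the 3D convolution, whose kernel spans height, width, and time so that each feature map mixes appearance with motion across adjacent frames. The following is a minimal sketch of that idea, assuming PyTorch as the framework; the class name, layer sizes, channel counts, and input dimensions are illustrative and are not the architecture reported in the paper.

```python
import torch
import torch.nn as nn

# Minimal sketch of a 3D CNN over clips of stacked video frames (assumes
# PyTorch). Layer sizes and input dimensions are illustrative only; they
# are not the paper's architecture.
class Tiny3DCNN(nn.Module):
    def __init__(self, num_classes=3):
        super().__init__()
        self.features = nn.Sequential(
            # The 3D kernel spans (frames, height, width), so the filter
            # responds to motion across adjacent frames, not just appearance.
            nn.Conv3d(in_channels=1, out_channels=8, kernel_size=(3, 5, 5)),
            nn.ReLU(),
            nn.MaxPool3d(kernel_size=(1, 2, 2)),   # spatial subsampling only
            nn.Conv3d(8, 16, kernel_size=(3, 3, 3)),
            nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),               # one feature vector per clip
        )
        self.classifier = nn.Linear(16, num_classes)

    def forward(self, x):
        # x: (batch, channels, frames, height, width)
        return self.classifier(self.features(x).flatten(1))

# Example: a batch of two single-channel clips, 9 frames of 60x40 pixels each.
clips = torch.randn(2, 1, 9, 60, 40)
print(Tiny3DCNN()(clips).shape)  # torch.Size([2, 3])
```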

Most cited references (54)


Gradient-based learning applied to document recognition


Reducing the dimensionality of data with neural networks

High-dimensional data can be converted to low-dimensional codes by training a multilayer neural network with a small central layer to reconstruct high-dimensional input vectors. Gradient descent can be used for fine-tuning the weights in such "autoencoder" networks, but this works well only if the initial weights are close to a good solution. We describe an effective way of initializing the weights that allows deep autoencoder networks to learn low-dimensional codes that work much better than principal components analysis as a tool to reduce the dimensionality of data.
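For context, a bare-bones sketch of the autoencoder idea summarized above, again assuming PyTorch; the layer sizes, optimizer settings, and input batch are arbitrary illustrations, and the pretraining-based weight initialization that the cited paper actually proposes is not shown.

```python
import torch
import torch.nn as nn

# Bare-bones autoencoder: a small central "code" layer forces a
# low-dimensional representation, and the network is trained by gradient
# descent to reconstruct its own input. Sizes are arbitrary; the weight
# initialization scheme of the cited paper is omitted.
class Autoencoder(nn.Module):
    def __init__(self, input_dim=784, code_dim=30):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 256), nn.Sigmoid(),
            nn.Linear(256, code_dim),          # small central layer
        )
        self.decoder = nn.Sequential(
            nn.Linear(code_dim, 256), nn.Sigmoid(),
            nn.Linear(256, input_dim),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = Autoencoder()
opt = torch.optim.SGD(model.parameters(), lr=0.1)   # plain gradient descent
x = torch.rand(64, 784)                             # stand-in input batch
loss = nn.functional.mse_loss(model(x), x)          # reconstruction error
loss.backward()
opt.step()
```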

Bagging predictors


Author and article information

Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE Trans. Pattern Anal. Mach. Intell.)
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
ISSN: 0162-8828 (print); 2160-9292 (electronic)
Publication date: January 2013
Volume 35, Issue 1, pp. 221-231
DOI: 10.1109/TPAMI.2012.59
PMID: 22392705
Record ID: 3d1c68e4-8ffb-4ab5-a6dc-fadce414db25
Copyright: © 2013
License: https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
