Cooking in the kitchen: Recognizing and Segmenting Human Activities in Videos

Abstract

As research on action recognition matures, the focus is shifting away from categorizing basic task-oriented actions in hand-segmented video datasets and toward understanding complex, goal-oriented daily human activities in real-world settings. Temporally structured models would seem an obvious choice for this set of problems, but so far, cases where they have outperformed simpler unstructured bag-of-words models are scarce. With the increasing availability of large human activity datasets, combined with the development of novel feature coding techniques that yield more compact representations, it is time to revisit structured generative approaches. Here, we describe an end-to-end generative approach, from the encoding of features to the structural modeling of complex human activities, that applies Fisher vectors and temporal models to the analysis of video sequences. We systematically evaluate the proposed approach on several available datasets (ADL, MPIICooking, and Breakfast) using a variety of performance metrics. Through extensive system evaluations, we demonstrate that combining compact video representations based on Fisher vectors with HMM-based modeling yields very significant gains in accuracy, and that, when properly trained with sufficient training samples, structured temporal models outperform unstructured bag-of-words models by a large margin on the tested performance metric.
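
To make the idea of HMM-based temporal modeling concrete, the sketch below shows Viterbi decoding, the standard way to turn per-frame state scores into a segmentation. It is an illustration only and not the authors' implementation: the function names `viterbi` and `segments`, the two-state left-to-right topology, and all probabilities are assumptions made up for this example. In a full system the per-frame scores would come from a learned observation model over video features rather than from hand-written numbers.

```python
import math

NEG_INF = float("-inf")


def viterbi(log_emissions, log_trans, log_start):
    """Return the most likely state index for every frame.

    log_emissions[t][s]: log-likelihood of frame t under state s
    log_trans[r][s]:     log-probability of moving from state r to state s
    log_start[s]:        log-probability of starting in state s
    """
    n_states = len(log_start)
    score = [log_start[s] + log_emissions[0][s] for s in range(n_states)]
    backptr = []
    for t in range(1, len(log_emissions)):
        new_score, ptr = [], []
        for s in range(n_states):
            # Best predecessor of state s at frame t.
            best_prev = max(range(n_states),
                            key=lambda r: score[r] + log_trans[r][s])
            ptr.append(best_prev)
            new_score.append(score[best_prev] + log_trans[best_prev][s]
                             + log_emissions[t][s])
        score = new_score
        backptr.append(ptr)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(backptr):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


def segments(path):
    """Collapse a per-frame path into (state, first_frame, last_frame) runs."""
    runs, start = [], 0
    for t in range(1, len(path) + 1):
        if t == len(path) or path[t] != path[start]:
            runs.append((path[start], start, t - 1))
            start = t
    return runs


# Toy input (made-up numbers): frames 0-2 look like state 0, frames 3-4 like state 1.
log_emissions = ([[math.log(0.9), math.log(0.1)]] * 3
                 + [[math.log(0.1), math.log(0.9)]] * 2)
# Hypothetical left-to-right topology: state 1 can never return to state 0.
log_trans = [[math.log(0.7), math.log(0.3)],
             [NEG_INF, 0.0]]
log_start = [0.0, NEG_INF]

path = viterbi(log_emissions, log_trans, log_start)
print(path)            # [0, 0, 0, 1, 1]
print(segments(path))  # [(0, 0, 2), (1, 3, 4)]
```

Because the decoder scores whole state sequences rather than isolated frames, the transition structure can rule out implausible orderings, which is the property that distinguishes a temporally structured model from an unstructured bag-of-words representation.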

Author and article information

Publication date Created: 2015-08-25

Publication date Updated: 2016-03-17

Article

arXiv ID: 1508.06073

SO-VID: f818d402-cb8d-44db-8fd5-afdb388b1084

License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/

Custom metadata

Comments: 15 pages, 12 figures

Categories: cs.CV

ScienceOpen disciplines: Computer vision & Pattern recognition
