      Is Open Access

      Learning Social Affordance Grammar from Videos: Transferring Human Interactions to Human-Robot Interactions

      Preprint


          Abstract

          In this paper, we present a general framework for learning a social affordance grammar as a spatiotemporal AND-OR graph (ST-AOG) from RGB-D videos of human interactions, and transfer the grammar to humanoids to enable real-time motion inference for human-robot interaction (HRI). Based on Gibbs sampling, our weakly supervised grammar learning automatically constructs a hierarchical representation of an interaction, with long-term joint sub-tasks of both agents and short-term atomic actions of individual agents. On a new RGB-D video dataset with rich instances of human interactions, our experiments with Baxter simulation, human evaluation, and real Baxter tests demonstrate that the model learned from limited training data successfully generates human-like behaviors in unseen scenarios and outperforms both baselines.
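To make the AND-OR graph idea concrete, here is a minimal illustrative sketch (not the authors' code; all labels are hypothetical): an OR node chooses one alternative sub-task, an AND node expands all of its children in order, and sampling a parse yields one concrete sequence of atomic actions.

```python
# Illustrative sketch of an AND-OR grammar, NOT the paper's ST-AOG
# implementation: nodes are either AND (expand all children in order),
# OR (choose one alternative), or leaves (atomic actions).
import random

class Node:
    def __init__(self, label, kind="leaf", children=None):
        self.label = label            # e.g. a sub-task or atomic action name
        self.kind = kind              # "and", "or", or "leaf"
        self.children = children or []

def sample_parse(node, rng=random):
    """Sample one concrete action sequence from the grammar."""
    if node.kind == "leaf":
        return [node.label]
    if node.kind == "or":             # OR node: pick one alternative branch
        return sample_parse(rng.choice(node.children), rng)
    out = []                          # AND node: expand every child in order
    for child in node.children:
        out.extend(sample_parse(child, rng))
    return out

# Toy grammar for a hypothetical "shake hands" interaction.
grammar = Node("shake-hands", "and", [
    Node("approach", "or", [Node("walk-to"), Node("turn-to")]),
    Node("extend-hand"),
    Node("shake"),
])
print(sample_parse(grammar))  # e.g. ['walk-to', 'extend-hand', 'shake']
```

The temporal ("spatiotemporal") aspect of the real ST-AOG, and its Gibbs-sampling-based learning, are beyond this toy example; it only shows the hierarchical AND/OR structure that such a grammar encodes.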


          Most cited references (6)


          Learning Object Affordances: From Sensory–Motor Coordination to Imitation


            Discriminative latent models for recognizing contextual group activities.

            In this paper, we go beyond recognizing the actions of individuals and focus on group activities. This is motivated by the observation that human actions are rarely performed in isolation; contextual information about what other people in the scene are doing provides a useful cue for understanding high-level activities. We propose a novel framework for recognizing group activities which jointly captures the group activity, the individual person actions, and the interactions among them. Two types of contextual information, group-person interaction and person-person interaction, are explored in a latent variable framework. In particular, we propose three different approaches to model the person-person interaction. The first explores the structure of person-person interaction: unlike most previous latent structured models, which assume a predefined structure for the hidden layer (e.g., a tree), we treat the structure of the hidden layer as a latent variable and implicitly infer it during learning and inference. The second approach explores person-person interaction at the feature level. We introduce a new feature representation called the action context (AC) descriptor, which encodes not only the action of an individual person in the video but also the behavior of other people nearby. The third approach combines the above two. Our experimental results demonstrate the benefit of using contextual information for disambiguating group activities.
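The feature-level idea can be sketched in a few lines. This is a hypothetical construction in the spirit of the action context descriptor, not the paper's implementation: concatenate a person's own per-action scores with a column-wise max-pool over the scores of nearby people.

```python
# Hypothetical "action context"-style descriptor (illustrative, not the
# paper's implementation): own action scores concatenated with a
# column-wise max over nearby people's action scores.
import numpy as np

def action_context(own_scores, nearby_scores):
    """own_scores: (K,) action scores for the focal person;
    nearby_scores: list of (K,) score vectors for nearby people."""
    if len(nearby_scores) == 0:
        context = np.zeros_like(own_scores)   # no neighbors: zero context
    else:
        context = np.max(np.asarray(nearby_scores), axis=0)
    return np.concatenate([own_scores, context])

own = np.array([0.9, 0.1, 0.0])
others = [np.array([0.2, 0.7, 0.1]), np.array([0.0, 0.3, 0.6])]
desc = action_context(own, others)
# desc holds [own scores | per-action max over neighbors], length 2K
```

The max-pool makes the context part invariant to the number and ordering of nearby people, which is what lets a fixed-length descriptor summarize a variable-size group.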

              Stochastic Representation and Recognition of High-Level Group Activities


                Author and article information

                Preprint date: 2017-03-01
                arXiv: 1703.00503
                Record ID: 5645d2ed-1d5d-42c0-8195-12b17eda9c12
                License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
                Venue: The 2017 IEEE International Conference on Robotics and Automation (ICRA)
                Categories: cs.RO cs.AI cs.CV

                Keywords: Computer vision & Pattern recognition, Robotics, Artificial intelligence
