Learning to Localize and Align Fine-Grained Actions to Sparse Instructions

Preprint

Abstract

Automatic generation of textual video descriptions that are time-aligned with video content is a long-standing goal in computer vision. The task is challenging due to the difficulty of bridging the semantic gap between the visual and natural language domains. This paper addresses the task of automatically generating an alignment between a set of instructions and a first-person video demonstrating an activity. The sparseness and ambiguity of written instructions create significant alignment challenges. The key to our approach is the use of egocentric cues to generate a concise set of action proposals, which are then matched to recipe steps using object recognition and computational linguistic techniques. We obtain promising results on both the Extended GTEA Gaze+ dataset and the Bristol Egocentric Object Interactions Dataset.
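
The abstract describes a two-stage pipeline: egocentric cues yield a concise set of temporal action proposals, which are then matched to instruction steps using object recognition and linguistic analysis. The paper's exact matching procedure is not given here, so the following is a minimal illustrative sketch, assuming proposals carry recognized object labels, similarity is simple noun overlap, and the alignment is an order-preserving dynamic program; all names and scoring choices are hypothetical, not the authors' implementation.

```python
# Hypothetical sketch: align temporal action proposals to sparse instruction
# steps. Assumptions (not from the paper): each proposal carries the object
# labels recognized in its segment, similarity is Jaccard overlap with a
# step's nouns, and the alignment is an order-preserving dynamic program
# that may skip proposals (background or unmentioned actions).

from dataclasses import dataclass

@dataclass
class Proposal:
    start: float        # segment start time (seconds)
    end: float          # segment end time (seconds)
    objects: frozenset  # object labels recognized in the segment

def similarity(proposal, step_nouns):
    """Jaccard overlap between recognized objects and a step's nouns."""
    union = proposal.objects | step_nouns
    return len(proposal.objects & step_nouns) / len(union) if union else 0.0

def align(proposals, steps, skip_cost=0.1):
    """Order-preserving alignment: each step matches at most one proposal,
    proposals may be skipped at a small cost, temporal order is preserved."""
    n, m = len(proposals), len(steps)
    # dp[i][j]: best score using the first i proposals and first j steps
    dp = [[float("-inf")] * (m + 1) for _ in range(n + 1)]
    back = {}
    dp[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if dp[i][j] == float("-inf"):
                continue
            if i < n:  # skip proposal i entirely
                if dp[i][j] - skip_cost > dp[i + 1][j]:
                    dp[i + 1][j] = dp[i][j] - skip_cost
                    back[(i + 1, j)] = (i, j, None)
            if i < n and j < m:  # match proposal i to step j
                s = dp[i][j] + similarity(proposals[i], steps[j])
                if s > dp[i + 1][j + 1]:
                    dp[i + 1][j + 1] = s
                    back[(i + 1, j + 1)] = (i, j, (i, j))
    # Recover matched (proposal index, step index) pairs; returns [] if no
    # full alignment of all steps was reachable.
    i, j, pairs = n, m, []
    while (i, j) in back:
        i, j, match = back[(i, j)]
        if match:
            pairs.append(match)
    return list(reversed(pairs))

if __name__ == "__main__":
    props = [Proposal(0, 4, frozenset({"pan", "oil"})),
             Proposal(5, 9, frozenset({"egg", "bowl"})),
             Proposal(10, 14, frozenset({"egg", "pan"}))]
    steps = [frozenset({"oil", "pan"}),   # "heat oil in the pan"
             frozenset({"egg", "pan"})]   # "crack the egg into the pan"
    print(align(props, steps))  # -> [(0, 0), (2, 1)]
```

Running the example aligns the first and third proposals to the two steps and skips the unmentioned middle segment; a real system would replace the Jaccard score with a learned visual-linguistic similarity.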


Author and article information

Published: 22 September 2018
Type: Article
arXiv ID: 1809.08381
Record ID: f1f634b6-6f2a-4a1b-8ad1-0344cf80695c
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
arXiv category: cs.CV
Subject: Computer vision & Pattern recognition
