Predicting human intention in visual observations of hand/object interactions

Published in ICRA, 2013

  1. Full citation

    Song, D., Kyriazis, N., Oikonomidis, I., Papazov, C., Argyros, A. A., Burschka, D., & Kragic, D. (2013). Predicting human intention in visual observations of hand/object interactions. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) (pp. 1608–1615). https://doi.org/10.1109/ICRA.2013.6630785

    Abstract

    The main contribution of this paper is a probabilistic method for predicting human manipulation intention from image sequences of human-object interaction. Predicting intention amounts to inferring the imminent manipulation task once a human hand is observed to have stably grasped an object. Inference is performed by means of a probabilistic graphical model that encodes object-grasping tasks over the 3D state of the observed scene. The 3D state is extracted from RGB-D image sequences by a novel vision-based, markerless hand-object 3D tracking framework. To deal with the high-dimensional state space and mixed data types (discrete and continuous) involved in grasping tasks, we introduce a generative vector quantization method using mixture models and self-organizing maps. This yields a compact model for encoding grasping actions that can handle uncertain and partial sensory data. Experiments showed that a model trained on simulated data provides a potent basis for accurate goal inference from partial and noisy observations of real-world demonstrations. We also demonstrate a grasp selection process, guided by the inferred human intention, to illustrate the use of the system for goal-directed grasp imitation.
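
    The compression step described above, quantizing continuous, high-dimensional grasp features into discrete symbols that a graphical model can reason over, can be illustrated with a self-organizing map. The following is a minimal, generic SOM quantizer sketch in Python (NumPy only); the grid size, learning schedule, and 6-D toy features are illustrative assumptions, not the paper's actual generative mixture-model formulation.

    ```python
    import numpy as np

    def train_som(data, grid_shape=(8, 8), n_iters=2000, lr0=0.5, sigma0=3.0, seed=0):
        """Train a 2D self-organizing map that quantizes continuous feature
        vectors (e.g. hand/object pose features) into discrete map nodes."""
        rng = np.random.default_rng(seed)
        n_nodes = grid_shape[0] * grid_shape[1]
        # Initialize node weights from random training samples.
        weights = data[rng.integers(0, len(data), n_nodes)].astype(float)
        # Grid coordinates of each node, used by the neighborhood function.
        coords = np.array([(i, j) for i in range(grid_shape[0])
                                  for j in range(grid_shape[1])], dtype=float)
        for t in range(n_iters):
            x = data[rng.integers(0, len(data))]
            # Best-matching unit: the node whose weight is closest to the sample.
            bmu = np.argmin(np.linalg.norm(weights - x, axis=1))
            # Exponentially decaying learning rate and neighborhood radius.
            lr = lr0 * np.exp(-t / n_iters)
            sigma = sigma0 * np.exp(-t / n_iters)
            # Gaussian neighborhood pulls nodes near the BMU toward the sample.
            d2 = np.sum((coords - coords[bmu]) ** 2, axis=1)
            h = np.exp(-d2 / (2.0 * sigma ** 2))
            weights += lr * h[:, None] * (x - weights)
        return weights

    def quantize(weights, x):
        """Map a continuous observation to its discrete SOM node index."""
        return int(np.argmin(np.linalg.norm(weights - x, axis=1)))

    # Usage: discretize 6-D continuous grasp features (stand-ins for simulated
    # grasp data) into 64 symbols a discrete graphical model could consume.
    features = np.random.rand(500, 6)
    som = train_som(features)
    symbol = quantize(som, features[0])
    ```

    Discretizing with a topology-preserving map rather than arbitrary binning keeps similar grasp configurations in nearby symbols, which is what makes the resulting discrete encoding compact and tolerant of partial observations.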

    Presentation video