Enno Littmann, Andrea Drees, Helge Ritter
We report on the development of the modular neural system "SEE-EAGLE" for the visual guidance of robot pick-and-place actions. Several neural networks are integrated into a single system that visually recognizes human hand pointing gestures from stereo pairs of color video images. The output of the hand recognition stage is processed by a set of color-sensitive neural networks to determine the Cartesian location of the target object that is referenced by the pointing gesture. Finally, this information is used to guide a robot to grab the target object and put it at another location that can be specified by a second pointing gesture. The current system identifies the location of the referenced target object to an accuracy of 1 cm in a workspace area of 50 x 50 cm. In our current environment, this is sufficient to pick and place arbitrarily positioned target objects within the workspace. The system consists of neural networks that perform the tasks of image segmentation, estimation of hand location, estimation of 3D pointing direction, object recognition, and the necessary coordinate transforms. Relying heavily on learning algorithms, we created the functions of all network modules from data examples only.
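As a purely geometric illustration of how an estimated hand location and 3D pointing direction can single out a spot in the workspace, the sketch below intersects a pointing ray with a flat table plane. This is not the method of the system described above, which learns its mappings from data examples. The function name `pointing_target_on_table`, the table plane at z = 0, and the coordinate convention (centimeters, z pointing up) are all assumptions made for the example.

```python
import numpy as np

def pointing_target_on_table(hand_xyz, direction_xyz, table_z=0.0):
    """Intersect a pointing ray with the horizontal plane z = table_z.

    hand_xyz      -- assumed 3D hand location (cm), origin of the ray
    direction_xyz -- assumed 3D pointing direction (need not be unit length)
    Returns the 3D intersection point, or None if the ray never hits the plane.
    """
    hand = np.asarray(hand_xyz, dtype=float)
    direction = np.asarray(direction_xyz, dtype=float)

    # A ray parallel to the table never intersects it.
    if abs(direction[2]) < 1e-9:
        return None

    # Solve hand_z + t * direction_z = table_z for the ray parameter t.
    t = (table_z - hand[2]) / direction[2]

    # t <= 0 means the table lies behind the hand along the pointing direction.
    if t <= 0:
        return None

    return hand + t * direction


# Hypothetical example: hand 30 cm above the table, pointing forward and down.
target = pointing_target_on_table([10.0, 20.0, 30.0], [1.0, 0.0, -1.0])
```

In a learned system such as the one described, a geometric estimate of this kind would at most narrow down the region in which the object recognition stage searches for the target object.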