{"title": "Learning Global Direct Inverse Kinematics", "book": "Advances in Neural Information Processing Systems", "page_first": 589, "page_last": 595, "abstract": null, "full_text": "Learning Global Direct Inverse Kinematics \n\nDavid DeMers\u00b7 \n\nComputer Science & Eng. \n\nUC San Diego \n\nLa Jolla, CA 92093-0114 \n\nKenneth Kreutz-Delgado t \nElectrical & Computer Eng. \n\nUC San Diego \n\nLa Jolla, CA 92093-0407 \n\nAbstract \n\nWe introduce and demonstrate a bootstrap method for construction of an in(cid:173)\nverse function for the robot kinematic mapping using only sample configuration(cid:173)\nspace/workspace data. Unsupervised learning (clustering) techniques are used on \npre-image neighborhoods in order to learn to partition the configuration space \ninto subsets over which the kinematic mapping is invertible. Supervised leam(cid:173)\ning is then used separately on each of the partitions to approximate the inverse \nfunction. The ill-posed inverse kinematics function is thereby regularized, and \na globa1 inverse kinematics solution for the wristless Puma manipulator is devel(cid:173)\noped. \n\n1 \n\nINTRODUCTION \n\nThe robot forward kinematics function is a continuous mapping \n\nf : C ~ en - w ~ Xm \n\nwhich maps a set of n joint parameters from the configuration space, C, to the m(cid:173)\ndimensiona1 task space, W. If m S n, the robot has redundant degrees-of-freedom \n(dof's). In general, control objectives such as the positioning and orienting of the end(cid:173)\neffector are specified with respect to task space co-ordinates; however, the manipulator is \ntypica1ly controlled only in the configuration space. Therefore, it is important to be able \nto find some 0 E C such that f(O) is a particular target va1ue xo E W. This is the inverse \nkinematics problem. \n\n\u2022 e-mail: demers@cs.ucsd.edu \nt e-mail: kreutz@ece.ucsd.edu \n\n589 \n\n\f590 \n\nDeMers and Kreutz-Delgado \n\nThe inverse kinematics problem is ill-posed. 
If there are redundant dof's then the problem is locally ill-posed, because the solution is non-unique and consists of a non-trivial manifold[1] in C. With or without redundant dof's, the problem is generally globally ill-posed because of the existence of a finite set of solution branches; there will typically be multiple configurations which result in the same task space location. Thus computation of a direct inverse is problematic due to the many-to-one nature (and therefore non-invertibility) of the map f. \n\nThe inverse problem can be solved explicitly, that is, in closed form, for only certain kinds of manipulators. E.g., six-dof elbow manipulators with a separable wrist (where the first three joints are used for positioning and the last three have a common origin and are used for orientation), such as the Puma 560, are solvable; see (Craig, 1986). The alternative to a closed-form solution is a numerical solution, usually either using the inverse of the Jacobian, which is a Newton-style approach, or using gradient descent (also a Jacobian-based method). These methods are iterative and require expensive Jacobian or gradient computation at each step, thus they are not well-suited for real-time control. \n\nNeural networks can be used to find an inverse by implementing either direct inverse modeling (estimating the explicit function f^{-1}) or differential methods. Implementations of the direct inverse approach typically fail due to the non-linearity of the solution set[3], or resolve this problem by restriction to a single solution a priori. However, such a prior restriction of the solutions may not be possible or acceptable in all circumstances, and may drastically reduce the dexterity and manipulability of the arm. 
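For contrast, the Jacobian-based numerical approach mentioned above can be sketched for a planar 2R arm (link lengths, starting guess, and iteration count are illustrative assumptions). Note that it is iterative, needs a Jacobian solve at every step, and converges only to the solution branch nearest its initial guess:

```python
import math

L1, L2 = 1.0, 1.0  # illustrative unit link lengths

def fk(t1, t2):
    return (L1 * math.cos(t1) + L2 * math.cos(t1 + t2),
            L1 * math.sin(t1) + L2 * math.sin(t1 + t2))

def jacobian(t1, t2):
    s1, c1 = math.sin(t1), math.cos(t1)
    s12, c12 = math.sin(t1 + t2), math.cos(t1 + t2)
    return [[-L1 * s1 - L2 * s12, -L2 * s12],
            [ L1 * c1 + L2 * c12,  L2 * c12]]

def ik_newton(target, t1, t2, iters=50):
    """Newton-style iteration: theta <- theta + J(theta)^-1 (x* - f(theta))."""
    for _ in range(iters):
        x, y = fk(t1, t2)
        ex, ey = target[0] - x, target[1] - y
        J = jacobian(t1, t2)
        det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
        if abs(det) < 1e-12:      # at a critical surface J is singular
            break
        # solve J d = e for the joint update d (2x2 Cramer's rule)
        t1 += (ex * J[1][1] - ey * J[0][1]) / det
        t2 += (ey * J[0][0] - ex * J[1][0]) / det
    return t1, t2

# Starting near one branch recovers that branch only.
sol = ik_newton(fk(0.3, 0.8), 0.5, 0.5)
```

A direct inverse replaces this per-query iteration with a single constant-time function evaluation, which is the run-time advantage sought in this paper.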
\n\nThe differential approaches either find only the nearest local solution, or resolve the multiplicity of solutions at training time, as with Jordan's forward modeling (Jordan & Rumelhart, 1990) or the approach of (Nguyen & Patel, 1990). We seek to regularize the mapping in such a way that all possible solutions are available at run-time, and can be computed efficiently as a direct constant-time inverse rather than approximated by slower iterative differential methods. To achieve the fast run-time solution, a significant cost in training time must be paid; however, it is not unreasonable to invest resources in off-line learning in order to attain on-line advantages. Thus we wish to gain the run-time computational efficiency of a direct inverse solution while also achieving the benefits of the differential approaches. \n\nThis paper introduces a method for performing global regularization; that is, identifying the complete, finite set of solutions to the inverse kinematics problem for a non-redundant manipulator. This will provide the ability to choose a particular solution at run time. Resolving redundancy is beyond the scope of this paper; however, preliminary work on a method which may be integrated with the work presented here is shown in (DeMers & Kreutz-Delgado, 1991). In the remainder of this paper it will be assumed that the manipulator does not have redundant dof's. It will also be assumed that all of the joints are revolute, thus the configuration space is a subset of the n-torus, T^n. \n\n[1] Generically of dimensionality equal to n - m. \n[2] The target values are assumed to be in the range of f, x \u2208 W = f(C), so the existence of a solution is not an issue in this paper. \n[3] Training a network to minimize mean squared error with multiple target values for the same input value results in a \"learned\" response of the average of the targets. 
Since the targets lie on a number of non-linear manifolds (for the redundant case) or consist of a finite number of points (for the non-redundant case), the average of multiple targets will typically not be a correct target. \n\n2 TOPOLOGY OF THE KINEMATICS FUNCTION \n\nThe kinematics mapping is continuous and smooth and, generically, neighborhoods in configuration space map to neighborhoods in the task space[4]. The configuration space, C, is made up of a finite number of disjoint regions or partitions, separated by (n - 1)-dimensional surfaces where the Jacobian loses rank (called critical surfaces); see (Burdick, 1988; Burdick, 1991). \n\nLet f : T^n \u2192 R^n be the kinematic mapping. Then \n\nW = f(C) = \u222a_{i=1}^{k} f_i(C_i) \n\nwhere f_i is the restriction of f to C_i, f_i : C_i \u2282 T^n \u2192 R^n, and each C_i is locally diffeomorphic to R^n. The C_i are each a connected region such that \n\n\u2200 \u03b8 \u2208 C_i, det(J(\u03b8)) \u2260 0 \n\nwhere J is the Jacobian of f, J \u2261 \u2202f/\u2202\u03b8. Define W_i as f(C_i). Generically, f_i is one-to-one and onto open neighborhoods of W_i[5], thus by the inverse function theorem \n\n\u2203 g_i(x) = f_i^{-1} : W_i \u2192 C_i, such that f \u2218 g_i(x) = x, \u2200 x \u2208 W_i. \n\nIn the general case, with redundant dof's, the kinematics over a single configuration-space region can be viewed as a fiber bundle, where the fibers are homeomorphic to T^{n-m}. The base space is the reachable workspace (the image of C_i under f). Solution branch resolution can be done by identifying distinct connected open coordinate neighborhoods of the configuration space which cover the workspace. Redundancy resolution can be done by a consistent parameterization of the fibers within each neighborhood. In the case at hand, without redundant dof's, the \"fibers\" are singleton sets and no resolution is needed. 
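As a concrete illustration of these regions, for a planar 2R arm the sign of det J partitions the configuration space, with the critical surfaces at the elbow-straight configurations. The arm and its closed-form determinant below are illustrative assumptions, not the Puma kinematics:

```python
import math

L1, L2 = 1.0, 1.0  # illustrative unit link lengths

def det_jacobian(t1, t2):
    # For a planar 2R arm, det J = L1 * L2 * sin(t2); the critical
    # surfaces det J = 0 are the elbow-straight sets t2 = 0 and t2 = pi.
    return L1 * L2 * math.sin(t2)

def region_label(t1, t2):
    """Label a configuration by which side of the critical surface it
    lies on -- a stand-in for membership in a region C_i."""
    return 0 if det_jacobian(t1, t2) > 0 else 1
```

The two elbow solutions of a generic target always receive different labels, since they sit on opposite sides of a critical surface.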
\n\nIn the remainder of this paper, we will use input/output data to identify the individual regions, C_i, of a non-redundant manipulator, over which the mapping f_i : C_i \u2192 W_i is invertible. The input/output data will then be partitioned modulo the configuration regions C_i, and each f_i^{-1} approximated individually. \n\n3 SAMPLING APPROACH \n\nIf the manipulator can be measured and a large sample of (\u03b8, x) pairs taken, stored such that the x samples can be searched efficiently, a rough estimate of the inverse solutions at a particular target point x_0 may be obtained by finding all of the \u03b8 points whose image lies within some \u03b5 of x_0. The pre-image of this \u03b5-ball will generically consist of several distinct (distorted) balls in the configuration space. If the sampling is adequate then there will be one such ball for each of the inverse solution branches. If each of the points in each ball is given a label for the solution branch, the labeled data may then be used for supervised learning of a classifier of solution branches in the configuration space. In this way we will have \"bootstrapped\" our way to the development of a solution branch classifier. \n\n[4] This property fails when the manipulator is in a singular configuration, at which the Jacobian, J, loses rank. \n[5] Since it is generically true that J is non-singular. \n\nTaking advantage of the continuous nature of the forward mapping, note that if x_0 is slightly perturbed by a \"jump\" to a neighboring target point then the pre-image balls will also be perturbed. We can assign labels to the new data consistent with labels already assigned to the previous data, by computing the distances between the new, unlabeled balls and the previously labeled balls. 
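The epsilon-ball pre-image step can be sketched as follows for a planar 2R arm; the grid sampling, epsilon, and the cluster-separation threshold are all illustrative assumptions. Every sampled theta whose image falls within epsilon of x_0 is collected, then the pre-image is greedily clustered into distinct balls, one per solution branch:

```python
import math

L1, L2 = 1.0, 1.0  # illustrative unit link lengths

def fk(t1, t2):
    return (L1 * math.cos(t1) + L2 * math.cos(t1 + t2),
            L1 * math.sin(t1) + L2 * math.sin(t1 + t2))

# Sample (theta, x) pairs on a grid over the configuration space.
step = 0.05
grid = [i * step - math.pi for i in range(int(2 * math.pi / step) + 1)]
samples = [((t1, t2), fk(t1, t2)) for t1 in grid for t2 in grid]

def preimage_balls(x0, eps=0.1, sep=0.5):
    """Return the distinct pre-image 'balls' of the eps-ball around x0,
    via greedy single-linkage clustering in configuration space."""
    near = [th for th, x in samples
            if math.hypot(x[0] - x0[0], x[1] - x0[1]) < eps]
    balls = []
    for th in near:
        for ball in balls:
            if any(math.hypot(th[0] - p[0], th[1] - p[1]) < sep
                   for p in ball):
                ball.append(th)
                break
        else:
            balls.append([th])
    return balls

balls = preimage_balls(fk(0.3, 0.8))
```

For a generic target of this arm the query yields two balls, one per solution branch; labeling their members gives exactly the kind of branch-labeled training data described above.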
Continuing in this fashion, x_0 traces a path through the entire workspace and solution branch labels may be given to all points in C which map to within \u03b5 of one of the selected x points along the sweep. \n\nThis procedure results in a significant and representative proportion of the data now being labeled as to solution branch. Thus we now have labeled data (\u03b8, x, B(\u03b8)), where B(\u03b8) \u2208 {1, ..., k} indicates which of the k solution branches, C_i, the point \u03b8 is in. We can now construct a classifier using supervised learning to compute the branch B(\u03b8) for a given \u03b8. Once an estimate of B(\u03b8) is developed, we may use it to classify large amounts of (\u03b8, x) data, and partition the data into k sets, one for each of the solution branches, C_i. \n\n4 RESOLUTION OF SOLUTION BRANCHES \n\nWe applied the above to the wristless Puma 560, a 3-R manipulator for end-effector positioning in R^3. We took 40,000 samples of (\u03b8, x) points, and examined all points within 10 cm of selected target values x_i. The x_i formed a grid of 90 locations in the workspace. 3,062 of the samples fell within 10 cm of one of the x_i. The configuration space points for each target x_i were clustered into four groups, corresponding to the four possible solution branches of the wristless Puma 560. About 3% of the points were clustered into the wrong group, based on the labeling scheme used. These 3,062 points were then used as training patterns for a feedforward neural network classifier. A point was classified into the group associated with the output unit of the neural network with maximum activation. The output values were normalized to sum to 1.0. The network was tested on 50,000 new, previously unseen (\u03b8, x) pairs, and correctly classified more than 98% of them. \n\nAll of the erroneous classifications were for points near the critical surfaces. Therefore the activation levels of the output units can be used to estimate closeness to a critical surface. 
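The resulting decision rule can be sketched directly. The 0.8 threshold is the one used in the text; the activation vectors shown are hypothetical normalized network outputs, not data from the paper:

```python
def classify_branch(activations, threshold=0.8):
    """Pick the branch with maximum normalized activation, or flag the
    point as near a critical surface if no unit is confident enough."""
    best = max(range(len(activations)), key=lambda i: activations[i])
    if activations[best] < threshold:
        return "near-singularity"
    return best

confident = classify_branch([0.92, 0.04, 0.03, 0.01])  # clear branch 0
uncertain = classify_branch([0.40, 0.35, 0.15, 0.10])  # no unit >= 0.8
```

Rejecting low-confidence points this way is what turns a 98% classifier into one that is 100% correct on the points it accepts.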
\nExamining the test data and assigning all \u03b8 points for which no output unit has activation greater than or equal to 0.8 to the \"near-a-singularity\" class, the remaining points were 100% correctly classified. \n\nFigure 1 shows the true critical manifold separating the regions of configuration space, and the estimated manifold consisting of points from the test set where the maximum activation of the output units of the trained neural network is less than 0.8. The configuration space is a subset of the 3-torus, which is shown here \"sliced\" along three generators and represented as a cube. Because the Puma 560 has physical limits on the range of motion of its joints, the regions shown are in fact six distinct regions, and there is no wraparound in any direction. This classifier network is our candidate for an estimate of B(\u03b8). With it, the samples can be separated into groups corresponding to the domains of each of the f_i, thus regularizing f into k = 6 one-to-one invertible pieces[6]. \n\n[6] Although there are only four inverse solutions for any x. If there were no joint limits, then the cube shown would be a true 3-torus, with opposite faces identified. Thus the small pieces in the corners would be part of the larger regions by wraparound in the Joint 2 direction. \n\nFigure 1: The analytically derived critical surfaces, along with 1,000 points for which no unit of the neural network classifier has greater than 0.8 activation. \n\n5 DIRECT INVERSE SOLUTIONS \n\nThe classifier neural network can now be used to partition the data into four groups, one for each of the branches, C_i. For each of these data sets, we train a feedforward network to learn the mapping in the inverse direction. The target vectors were represented as vectors of the sine of the half-angle (a measure motivated by the quaternion representation of orientation). An MSE under 0.001 was achieved for each of the four. This looks like a very small error; however, it is somewhat misleading. The configuration space error is measured in units which are difficult to interpret. More important is the error in the workspace when the computed solution is used in the forward kinematics mapping to position the arm. Over a test set of 4,000 points, the average positioning error was 5.2 cm over the 92 cm radius workspace. \n\nWe have as yet made no attempt to optimize the network or training for the direct inverse; the thrust of our work is in achieving the regularization. It is clear that substantially better performance can be developed, for example, by following (Ritter, et al., 1989), and we expect end-effector positioning errors of less than 1% to be easily achievable. \n\n6 DISCUSSION \n\nWe have shown that by exploiting the topological property of continuity of the kinematic mapping for a non-redundant 3-dof robot we can determine all of the solution regions of the inverse kinematic mapping. We have mapped out the configuration space critical surfaces and thus discovered an important topological property of the mapping, corresponding to an important physical property of the manipulator, by unsupervised learning. We can bootstrap from the original input/output data, unlabeled as to solution branch, and construct an accurate classifier for the entire configuration space. The data can thereby be partitioned into sets which are individually one-to-one and invertible, and the inverse mapping can be directly approximated for each. Thus a large learning-time investment results in a fast run-time direct inverse kinematics solution. 
\n\n\f594 \n\nDeMers and Kreutz-Delgado \n\nWe need a thorough sampling of the configuration space in order to ensure that enough \npoints will fall within each f-ball, thus the data requirements are clearly exponential in the \nnumber of degrees of freedom of the manipulator. Even with efficient storage and retrieval \nin geometric data structures, such as a k-d tree, high dimensional systems may not be \ntractable by our methods. \n\nFortunately practical and useful robotic systems of six and seven degrees of freedom \nshould be amenable to this method, especially if separable into positioning and orienting \nsubsystems. \n\nAcknowledgements \n\nThis work was supported in part by NSF Presidential Young Investigator award IRI-\n9057631 and a NASA/Netrologic grant. The first author would like to thank NIPS for \nproviding student travel grants. We thank Gary Cottrell for his many helpful comments and \nenthusiastic discussions. \n\nReferences \n\nJoel Burdick (1991), \"A Classification of 3R Regional Manipulator Singularities and Ge(cid:173)\nometries\", Proc. 19911\u00a3E\u00a3 Inti. Con! Robotics & Automation, Sacramento. \n\nJoel Burdick (1988), \"Kinematics and Design of Redundant Robot Manipulators\", Stanford \nPh.D. Thesis, Dept. of Mechanical Engineering. \n\nJohn Craig (1986), Introduction to Robotics, Addison-Wesley. \n\nDavid DeMers & Kenneth Kreutz-Delgado (1991), \"Learning Global Topological Proper(cid:173)\nties of Robot Kinematic Mappings for Neural Network-Based Configuration Control\", in \nBekey, ed. Proc. USC Workshop on Neural Networks in Robotics, (to appear). \n\nMichael I. Jordan (1988), \"Supervised Learning and Systems with Excess Degrees of \nFreedom\", COINS Technical Report 88-27, University of Massachusetts at Amherst. \n\nMichael!. Jordan & David E. Rumelhart (1990), \"Forward Models: Supervised Learning \nwith a Distal Teacher\". Submitted to Cognitive Science. \n\nL. Nguyen & R.V. 
Patel (1990), \"A Neural Network Based Strategy for the Inverse Kinematics Problem in Robotics\", in Jamshidi and Saif, eds., Robotics and Manufacturing: Recent Trends in Research, Education and Applications, vol. 3, pp. 995-1000 (ASME Press). \n\nHelge J. Ritter, Thomas M. Martinetz, & Klaus J. Schulten (1989), \"Topology-Conserving Maps for Learning Visuo-Motor-Coordination\", Neural Networks, Vol. 2, pp. 159-168. \n", "award": [], "sourceid": 542, "authors": [{"given_name": "David", "family_name": "DeMers", "institution": null}, {"given_name": "Kenneth", "family_name": "Kreutz-Delgado", "institution": null}]}