{"title": "Augmented-SVM: Automatic space partitioning for combining multiple non-linear dynamics", "book": "Advances in Neural Information Processing Systems", "page_first": 1016, "page_last": 1024, "abstract": "Non-linear dynamical systems (DS) have been used extensively for building generative models of human behavior. Its applications range from modeling brain dynamics to encoding motor commands. Many schemes have been proposed for encoding robot motions using dynamical systems with a single attractor placed at a predefined target in state space. Although these enable the robots to react against sudden perturbations without any re-planning, the motions are always directed towards a single target. In this work, we focus on combining several such DS with distinct attractors, resulting in a multi-stable DS. We show its applicability in reach-to-grasp tasks where the attractors represent several grasping points on the target object. While exploiting multiple attractors provides more flexibility in recovering from unseen perturbations, it also increases the complexity of the underlying learning problem. Here we present the Augmented-SVM (A-SVM) model which inherits region partitioning ability of the well known SVM classifier and is augmented with novel constraints derived from the individual DS. The new constraints modify the original SVM dual whose optimal solution then results in a new class of support vectors (SV). These new SV ensure that the resulting multi-stable DS incurs minimum deviation from the original dynamics and is stable at each of the attractors within a finite region of attraction. 
We show, via implementations on a simulated 10 degrees of freedom mobile robotic platform, that the model is capable of real-time motion generation and is able to adapt on-the-fly to perturbations.", "full_text": "Augmented-SVM: Automatic space partitioning for combining multiple non-linear dynamics\n\nAshwini Shukla\n\nAude Billard\n\nashwini.shukla@epfl.ch\n\naude.billard@epfl.ch\n\nLearning Algorithms and Systems Laboratory (LASA)\n\nÉcole Polytechnique Fédérale de Lausanne (EPFL)\n\nLausanne, Switzerland - 1015\n\nAbstract\n\nNon-linear dynamical systems (DS) have been used extensively for building generative models of human behavior. Their applications range from modeling brain dynamics to encoding motor commands. Many schemes have been proposed for encoding robot motions using dynamical systems with a single attractor placed at a predefined target in state space. Although these enable the robots to react to sudden perturbations without any re-planning, the motions are always directed towards a single target. In this work, we focus on combining several such DS with distinct attractors, resulting in a multi-stable DS. We show its applicability in reach-to-grasp tasks where the attractors represent several grasping points on the target object. While exploiting multiple attractors provides more flexibility in recovering from unseen perturbations, it also increases the complexity of the underlying learning problem. Here we present the Augmented-SVM (A-SVM) model, which inherits the region partitioning ability of the well-known SVM classifier and is augmented with novel constraints derived from the individual DS. The new constraints modify the original SVM dual, whose optimal solution then results in a new class of support vectors (SV). 
These new SV ensure that the resulting multi-stable DS incurs minimum deviation from the original dynamics and is stable at each of the attractors within a finite region of attraction. We show, via implementations on a simulated 10 degrees of freedom mobile robotic platform, that the model is capable of real-time motion generation and is able to adapt on-the-fly to perturbations.\n\n1 Introduction\n\nDynamical systems (DS) have proved to be a promising framework for encoding and generating complex motions. A major advantage of representing motion using DS based models [1, 2, 3, 4] is the ability to counter perturbations by virtue of the fact that re-planning of trajectories is instantaneous. These are generative schemes that define the flow of trajectories in state space x ∈ R^N by means of a non-linear dynamical function ẋ = f(x). DS with single stable attractors have been used in pick and place tasks to control both the motion of the end-effector [5, 6, 7] and the placement of the fingers on an object [8]. Assuming a single attractor, and hence a single grasping location on the object, considerably constrains the applicability of these methods to realistic grasping problems. A DS composed of multiple stable attractors provides an opportunity to encode different ways to reach and grasp an object. Recent neuro-physiological results [9] have shown that DS based modeling best explains the trajectories followed by humans while switching between several reaching targets. 
From a robotics viewpoint, a robot controlled using a DS with multiple attractors would\n\n[Figure 2: Combining motions using naive SVM classification based switching. Panels: (a) Motion 1; (b) Motion 2; (c) Crossing over; (d) Fast switching. Legend: training data, streamlines, attractors.]\n\nbe able to switch online across grasping strategies. This may be useful, e.g., when one grasping point is no longer accessible due to a sudden change in the orientation of the object or the appearance of an obstacle along the current trajectory. This paper presents a method by which one can learn multiple dynamics directed toward different attractors in a single dynamical system.\n\nThe dynamical function f(x) is usually estimated using non-linear regression functions such as Gaussian Process Regression (GPR) [10], Gaussian Mixture Regression (GMR) [7], Locally Weighted Projection Regression (LWPR) [11] or Dynamical Movement Primitives (DMP) [1]. However, all of these works modeled DS with a single attractor. While [7, 12] ensure global stability at the attractor, other approaches result in unstable DS with spurious attractors.\n\nStability at multiple targets has been addressed to date largely through neural network approaches. The Hopfield network and variants offered a powerful means to encode several stable attractors in the same system to provide a form of content-addressable memory [13, 14]. The dynamics to reach these attractors was, however, not controlled for, nor was the partitioning of the state space that would send the trajectories to each attractor. 
Echo-state networks provide alternative ways to encode various complex dynamics [15]. Although they have proved to be universal estimators, their ability to generalize in untrained regions of state space remains unverified. Also, the key issue of global stability of the learned dynamics is achieved using heuristic rules. To our knowledge, this is the first attempt at learning simultaneously a partitioning of the state space and an embedding of multiple dynamical systems with separate regions of attraction and distinct attractors.\n\n[Figure 1: 8 attractor DS.]\n\n2 Preliminaries\n\nA naive approach to building a multi-attractor DS would be to first partition the space and then learn a DS in each partition separately. This would unfortunately rarely result in the desired compound system. Consider, for instance, two DS with distinct attractors, as shown in Fig. 2(a)-(b). First, we build an SVM classifier to separate data points of the first DS, labeled +1, from data points of the other DS, labeled −1. We then estimate each DS separately using any of the techniques reviewed in the previous section. Let h : R^N → R denote the classifier function that separates the state space x ∈ R^N into two regions with labels y_i ∈ {+1, −1}. Also, let the two DS be ẋ = f_{y_i}(x) with stable attractors at x*_{y_i}. The combined DS is then given by ẋ = f_{sgn(h(x))}(x). Figure 2(c) shows the trajectories resulting from this approach. Due to the non-linearity of the dynamics, trajectories initialized in one region cross the boundary and converge to the attractor located in the opposite region. In other words, each region partitioned by the SVM hyperplane is not a region of attraction for its attractor. 
In a real-world scenario where the attractors represent grasping points on an object and the trajectories are to be followed by robots, crossing over may take the trajectories towards kinematically unreachable regions. Also, as shown in Fig. 2(d), trajectories that encounter the boundary may switch rapidly between different dynamics, leading to jittery motion.\n\nTo ensure that the trajectories do not cross the boundary and remain within the region of attraction of their respective attractors, one could adopt a more informed approach in which each of the original DS is modulated such that the generated trajectories always move away from the classifier boundary. Recall that, by construction, the absolute value of the classifier function h(x) increases as one moves away from the classification hyperplane; h(x) hence grows as one moves deeper into the positive-class region and decreases as one moves deeper into the negative-class region. We can exploit this observation to deflect selective components of the velocity signal from the original DS along, respectively opposite to, the direction ∇h(x). Concretely, if ẋ_O = f_{sgn(h(x))}(x) denotes the velocity obtained from the original DS and\n\nλ(x) = max(ε, ∇h(x)^T ẋ_O)  if h(x) > 0;  λ(x) = min(−ε, ∇h(x)^T ẋ_O)  if h(x) < 0,   (1)\n\nthe modulated dynamical system is given by\n\nẋ = f̃(x) = λ(x)∇h(x) + ẋ_⊥.   (2)\n\nHere, ε is a small positive scalar and ẋ_⊥ = ẋ_O − (∇h(x)^T ẋ_O / ‖∇h(x)‖²) ∇h(x) is the component of the original velocity perpendicular to ∇h. This results in a vector field that flows along increasing values of the classifier function in the regions of space where h(x) > 0 and along decreasing values for h(x) < 0. 
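For concreteness, the modulation of Eqs. (1)-(2) can be sketched in a few lines. This is our own illustrative 2-D implementation, not the authors' code; the classifier value h(x), its gradient, and the original DS velocity are assumed to be supplied by the caller:

```python
EPS = 1e-3  # the small positive scalar epsilon of Eq. (1); value chosen for illustration

def modulate(h_val, grad_h, xdot_o):
    """Return the modulated velocity of Eq. (2) for a 2-D state.

    h_val  : h(x), only its sign is used
    grad_h : gradient of the classifier at x, (dh/dx1, dh/dx2)
    xdot_o : velocity of the original DS at x
    """
    dot = grad_h[0] * xdot_o[0] + grad_h[1] * xdot_o[1]   # grad_h^T xdot_o
    norm2 = grad_h[0] ** 2 + grad_h[1] ** 2               # ||grad_h||^2
    # Eq. (1): guarantee at least a small flow along +grad_h (resp. -grad_h)
    lam = max(EPS, dot) if h_val > 0 else min(-EPS, dot)
    # component of the original velocity perpendicular to grad_h
    perp = (xdot_o[0] - dot / norm2 * grad_h[0],
            xdot_o[1] - dot / norm2 * grad_h[1])
    return (lam * grad_h[0] + perp[0], lam * grad_h[1] + perp[1])
```

Note that, by construction, ∇h(x)^T ẋ = λ(x)‖∇h(x)‖², so the directional derivative of h along the modulated flow is at least ε‖∇h(x)‖² in the positive region; this is what prevents trajectories from approaching the boundary.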
As a result, the trajectories move away from the classification hyperplane and converge to a point located in the region where they were initialized. Such modulated systems have been used extensively for estimating stability regions of interconnected power networks [16] and are known as quasi gradient systems [17]. If h(x) is upper bounded¹, all trajectories converge to one of the stationary points {x : ∇h(x) = 0} and h(x) is a Lyapunov function of the overall system (refer to [17], Proposition 1).\n\nFigure 3 shows the result of applying the above modulation to our pair of DS. As expected, it forces the trajectories to flow along the gradient of the function h(x). Although this solves the problem of \"crossing over\" the boundary, the trajectories obtained are deficient in two major ways: they depart heavily from the original dynamics, and they do not terminate at the desired attractors. This is due to the fact that the function h(x) used to modulate the DS was designed solely for classification and contained no information about the dynamics of the two original DS. In other words, the vector field given by ∇h(x) was not aligned with the flow of the training trajectories, and the stationary points of the modulation function did not coincide with the desired attractors.\n\nIn subsequent sections, we show how we can learn a new modulation function which takes into account the three issues highlighted in this preliminary discussion. We will seek a system that a) ensures strict classification across regions of attraction (ROA) for each DS, b) follows closely the dynamics of each DS in each ROA and c) ensures that all trajectories in each ROA reach the desired attractor. Satisfying requirements a) and b) above is equivalent to performing classification and regression simultaneously. 
We take advantage of the fact that the optimizations in support vector classification and support vector regression have the same form to phrase our problem in a single constrained optimization framework. In the next sections, we show that, in addition to the usual SVM support vectors (SVs), the resulting modulation function is composed of an additional class of SVs. We geometrically analyze the effect of these new support vectors on the resulting dynamics. While this preliminary discussion considered solely binary classification, we will now extend the problem to multi-class classification.\n\n[Figure 3: Modulated trajectories.]\n\n3 Problem Formulation\n\nThe N-dimensional state space of the system represented by x ∈ R^N is partitioned into M different classes, one for each of the M motions to be combined. We collect trajectories in the state space, yielding a set of P data points {x_i; ẋ_i; l_i}_{i=1...P} where l_i ∈ {1, 2, ..., M} refers to the class label of each point². To learn the set of modulation functions {h_m(x)}_{m=1...M}, we proceed recursively. We learn each modulation function in a one-vs-all classifier scheme and then compute the final modulation function h̃(x) = max_{m=1...M} h_m(x). In the multi-class setting, the behavior of avoiding boundaries is obtained if the trajectories move along increasing values of the function h̃(x). To this effect, the deflection term λ(x) presented in the binary case (Eq. 1) becomes λ(x) = max(ε, ∇h̃(x)^T ẋ_O) ∀x ∈ R^N. \n\n¹ The SVM classifier function is bounded if the Radial Basis Function (rbf) is used as kernel.\n² Bold faced fonts represent vectors: x_i (bold) denotes the i-th vector and x_i (plain) the i-th element of vector x.\n\n
Next, we describe the procedure for learning a single h_m(x) function. We follow the classical SVM formulation and lift the data into a higher dimensional feature space through the mapping φ : R^N → R^F where F denotes the dimension of the feature space. We also assume that each function h_m(x) is linear in feature space, i.e., h_m(x) = w^T φ(x) + b where w ∈ R^F, b ∈ R. We label the current (m-th) motion class as positive and all others negative, such that the set of labels for the current sub-problem is given by\n\ny_i = +1 if l_i = m;  y_i = −1 if l_i ≠ m;   i = 1...P.\n\nAlso, the set indexing the positive class is then defined as I+ = {i : i ∈ [1, P]; l_i = m}. With this, we formalize the three constraints explained in Section 2 as:\n\nRegion separation: Requiring each point to be classified correctly yields P constraints:\n\ny_i (w^T φ(x_i) + b) ≥ 1   ∀i = 1...P.   (3)\n\nLyapunov constraint: To ensure that the modulated flow is aligned with the training trajectories, the gradient of the modulation function must have a positive component along the velocities at the data points. That is,\n\n∇h_m(x_i)^T ˆẋ_i = w^T J(x_i) ˆẋ_i ≥ 0   ∀i ∈ I+   (4)\n\nwhere J ∈ R^{F×N} is the Jacobian matrix given by J = [∇φ_1(x) ∇φ_2(x) ... ∇φ_F(x)]^T and ˆẋ_i = ẋ_i/‖ẋ_i‖ is the normalized velocity at the i-th data point.\n\nStability: Lastly, the gradient of the modulation function must vanish at the attractor of the positive class x*. 
This constraint can be expressed as\n\n∇h_m(x*)^T e_i = w^T J(x*) e_i = 0   ∀i = 1...N   (5)\n\nwhere the set of vectors {e_i}_{i=1...N} is the canonical basis of R^N.\n\n3.1 Primal & Dual forms\n\nAs in the standard SVM [18], we optimize for maximal margin between the positive and negative class, subject to constraints (3)-(5) above. This can be formulated as:\n\nmin_{w,ξ_i}  (1/2)‖w‖² + C Σ_{i∈I+} ξ_i\n\nsubject to\n\ny_i (w^T φ(x_i) + b) ≥ 1   ∀i = 1...P\nw^T J(x_i) ˆẋ_i + ξ_i > 0   ∀i ∈ I+\nξ_i > 0   ∀i ∈ I+\nw^T J(x*) e_i = 0   ∀i = 1...N.   (6)\n\nHere ξ_i ∈ R are slack variables that relax the Lyapunov constraint in Eq. 4. We retain these in our formulation to accommodate noise in the data representing the dynamics. C ∈ R+ is a penalty parameter for the slack variables. The Lagrangian for the above problem can be written as\n\nL(w, b, α, β, γ) = (1/2)‖w‖² + C Σ_{i∈I+} ξ_i − Σ_{i=1}^{P} α_i (y_i (w^T φ(x_i) + b) − 1) − Σ_{i∈I+} β_i (w^T J(x_i) ˆẋ_i + ξ_i) − Σ_{i∈I+} μ_i ξ_i + Σ_{i=1}^{N} γ_i w^T J(x*) e_i   (7)\n\nwhere α_i, β_i, μ_i, γ_i are the Lagrange multipliers with α_i, β_i, μ_i ∈ R+ and γ_i ∈ R. 
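The three constraint families of Eq. (6) can be probed numerically for any candidate modulation function, which is a useful sanity check after training. The sketch below uses central finite differences; the function names and the toy test function are ours, not from the paper:

```python
import math

def grad_fd(h, x, eps=1e-5):
    """Central finite-difference gradient of a scalar function h at point x."""
    g = []
    for i in range(len(x)):
        xp = list(x); xp[i] += eps
        xm = list(x); xm[i] -= eps
        g.append((h(xp) - h(xm)) / (2 * eps))
    return g

def lyapunov_margin(h, x, xdot):
    """Directional derivative of h along the normalized velocity (cf. Eq. 4);
    it should be >= 0 (up to slack) at every positive-class training point."""
    n = math.sqrt(sum(c * c for c in xdot))
    g = grad_fd(h, x)
    return sum(gi * ci / n for gi, ci in zip(g, xdot))

def stationarity_residual(h, x_star):
    """Norm of grad h at the attractor (cf. Eq. 5); it should be close to zero."""
    g = grad_fd(h, x_star)
    return math.sqrt(sum(gi * gi for gi in g))
```

For instance, the toy function h(x) = 1 − ‖x‖² has a vanishing gradient at the origin and a positive margin at any point whose velocity points towards the origin.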
Employing a similar analysis as in the standard SVM, it can be shown that the corresponding dual is given by the constrained quadratic program:\n\nmin_{α,β,γ}  (1/2) [α^T β^T γ^T] [ K G −G* ; G^T H −H* ; −G*^T −H*^T H** ] [α; β; γ] − α^T 1\n\nsubject to  0 ≤ α_i ∀i = 1...P;  0 ≤ β_i ≤ C ∀i ∈ I+;  Σ_{i=1}^{P} α_i y_i = 0\n\nwhere 1 ∈ R^P is a vector with all entries equal to one. Let k : R^N × R^N → R represent the kernel function such that k(x_1, x_2) = φ^T(x_1) φ(x_2). The matrices K ∈ R^{P×P}, G ∈ R^{P×|I+|}, G* ∈ R^{P×N}, H ∈ R^{|I+|×|I+|}, H* ∈ R^{|I+|×N}, H** ∈ R^{N×N} can be expressed in terms of the kernel function and its first and second order derivatives:\n\n(K)_ij = y_i y_j k(x_i, x_j);  (G)_ij = y_i (∂k(x_i, x_j)/∂x_j)^T ˆẋ_j;  (G*)_ij = y_i (∂k(x_i, x*)/∂x*)^T e_j;\n(H)_ij = ˆẋ_i^T (∂²k(x_i, x_j)/∂x_i∂x_j) ˆẋ_j;  (H*)_ij = ˆẋ_i^T (∂²k(x_i, x*)/∂x_i∂x*) e_j;  (H**)_ij = e_i^T (∂²k(x*, x*)/∂x*∂x*) e_j   (8)\n\nwhere (.)_ij denotes the (i, j)-th entry of the corresponding matrix. Due to space constraints, the detailed development of the dual and proofs of the above relations are given in Appendices A and B of the supplementary material.\n\nNote that since the matrices K, H and H** are symmetric, the overall Hessian matrix for the resulting quadratic program is also symmetric. 
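The entries of Eq. (8) only require the kernel and its first and second derivatives. For the rbf kernel k(x, y) = exp(−‖x − y‖²/(2σ²)) these have simple closed forms, sketched below in plain Python for illustration (the module-level σ is our own convention, not the paper's):

```python
import math

SIGMA = 1.0  # rbf kernel width; illustrative value

def rbf(x, y):
    """k(x, y) = exp(-||x - y||^2 / (2 sigma^2))."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-d2 / (2 * SIGMA ** 2))

def rbf_grad_y(x, y):
    """First derivative dk/dy = k(x, y) (x - y) / sigma^2 (enters G and G*)."""
    k = rbf(x, y)
    return [k * (a - b) / SIGMA ** 2 for a, b in zip(x, y)]

def rbf_hess_xy(x, y):
    """Second derivative d^2k/dx dy = k (I/sigma^2 - r r^T / sigma^4), r = x - y
    (enters H, H* and H**)."""
    k = rbf(x, y)
    r = [a - b for a, b in zip(x, y)]
    n = len(x)
    return [[k * ((1.0 if i == j else 0.0) / SIGMA ** 2 - r[i] * r[j] / SIGMA ** 4)
             for j in range(n)] for i in range(n)]
```

At x = y the Hessian reduces to I/σ², which is why (H**)_ij is well defined even though both arguments are the same attractor x*.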
However, unlike the standard SVM dual, it may not be positive definite, resulting in multiple solutions to the above problem. In our implementation, we use the interior point solver IPOPT [19] to find a local optimum. We initialize the iterations using the α found by first running a standard SVM classification problem. All entries of β and γ are set to 0³. The solution to the above problem yields a modulation function (see Eq. A.11 for proof) given by\n\nh_m(x) = Σ_{i=1}^{P} α_i y_i k(x, x_i) + Σ_{i∈I+} β_i ˆẋ_i^T ∂k(x, x_i)/∂x_i − Σ_{i=1}^{N} γ_i e_i^T ∂k(x, x*)/∂x* + b   (9)\n\nwhich can be further expanded depending on the choice of kernel. Expansions for the Radial Basis Function (rbf) kernel are given in Appendix C.\n\nThe modulation function (9) learned using the A-SVM has noticeable similarities with the standard SVM classifier function. The first summation term is composed of the α support vectors (α-SV), which act as support to the classification hyperplane. The second term entails a new class of support vectors that perform a linear combination of the normalized velocities ˆẋ_i at the training data points x_i. 
These β support vectors (β-SVs) collectively contribute to the fulfillment of the Lyapunov constraint by introducing a positive slope in the modulation function value along the directions ˆẋ_i. Figure 4 shows the influence of a β-SV for the rbf kernel k(x_i, x_j) = e^{−‖x_i−x_j‖²/(2σ²)} with x_i at the origin and ˆẋ_i = [1/√2 1/√2]^T. It can be seen that the smaller the kernel width σ, the steeper the slope. The third summation term is a non-linear bias, which does not depend on the chosen support vectors, and performs a local modification around the desired attractor x* to ensure that the modulation function has a local maximum at that point. b is the constant bias which normalizes the classification margins to −1 and +1. We calculate its value by making use of the fact that for all the α-SVs x_i, we must have y_i h_m(x_i) = 1; we use the average of the values obtained from the different support vectors.\n\nFigure 5 illustrates the effects of the support vectors in a 2D example by progressively adding them and overlaying the resulting DS flow in each case. The value of the modulation function h_m(x) is shown by the color plot (white indicates high values). As the β-SVs are added in Figs. 5(b)-(d), they push the flow of trajectories along their associated directions. In Figs. 5(e)-(f), adding the two γ terms shifts the location of the maximum of the modulation function to coincide with the desired attractor. 
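Putting Eq. (9) together for the rbf kernel gives a direct evaluator for h_m. The sketch below is our own, not the released implementation; the container conventions (lists of SV tuples) and the fixed σ are illustrative assumptions:

```python
import math

SIGMA = 0.5  # rbf kernel width; illustrative value

def k(x, y):
    """rbf kernel k(x, y) = exp(-||x - y||^2 / (2 sigma^2))."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, y)) / (2 * SIGMA ** 2))

def dk_dy(x, y):
    """dk/dy = k(x, y) (x - y) / sigma^2 for the rbf kernel."""
    kv = k(x, y)
    return [kv * (a - b) / SIGMA ** 2 for a, b in zip(x, y)]

def h_m(x, alpha_svs, beta_svs, gamma, x_star, b):
    """Evaluate the modulation function of Eq. (9).

    alpha_svs : list of (alpha_i, y_i, x_i)        -- classification term
    beta_svs  : list of (beta_i, xdot_hat_i, x_i)  -- Lyapunov term
    gamma     : list of gamma_i, one per state dimension
    x_star    : attractor of the positive class
    b         : constant bias
    """
    val = b
    for a_i, y_i, x_i in alpha_svs:            # sum_i alpha_i y_i k(x, x_i)
        val += a_i * y_i * k(x, x_i)
    for b_i, v_i, x_i in beta_svs:             # sum_i beta_i xdot_hat_i^T dk/dx_i
        g = dk_dy(x, x_i)
        val += b_i * sum(vc * gc for vc, gc in zip(v_i, g))
    g_star = dk_dy(x, x_star)                  # - sum_i gamma_i e_i^T dk/dx*
    val -= sum(g_i * gs for g_i, gs in zip(gamma, g_star))
    return val
```

Since every term is a kernel value or a kernel derivative, one evaluation costs O(S) in the number of support vectors, consistent with the timing discussion in Section 4.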
Once all the SVs have been taken into account, the streamlines of the resulting DS achieve the desired criteria, i.e., they follow the training trajectories and terminate at the desired attractor.\n\n[Figure 4: Isocurves of f(x) = ˆẋ_i^T ∂k(x, x_i)/∂x_i at x_i = [0 0]^T, ˆẋ_i = [1/√2 1/√2]^T for the rbf kernel. Panels: (a) σ = 1; (b) σ = 0.5.]\n\n³ Source code for learning is available at http://asvm.epfl.ch\n\n[Figure 5: Progressively adding support vectors to highlight their effect on shaping the dynamics of the motion. (a) α-SVs largely affect classification. (b)-(d) β-SVs guide the flow of trajectories along their respective associated directions ˆẋ_i shown by arrows. (e)-(f) The 2 γ terms force the local maximum of the modulation function to coincide with the desired attractor along the X and Y axes respectively. Panels: (a) α only; (b)-(d) α and β; (e) α, β and γ_1; (f) α, β, γ_1 and γ_2. Legend: modulated streamlines, training data, α-SV, β-SV, obtained attractor, desired attractor.]\n\n4 Results\n\nIn this section, we validate the presented A-SVM model on 2D (synthetic) data and on a simulated robotic experiment using a 7 degrees of freedom (DOF) KUKA-LWR arm mounted on a 3-DOF Omnirob base to catch falling objects. A video of the robotic experiment - simulated and real - is provided in the annexes. 
Next, we present a cross-validation analysis of the error introduced by the modulation in the original dynamics. A sensitivity analysis of the region of attraction of the resulting dynamical system with respect to the model parameters is also presented. We used the rbf kernel for all the results presented in this section. As discussed in Section 2, the rbf kernel is advantageous as it ensures that the function h_m(x) is bounded. To generate an initial estimate of each individual dynamical system, we used the technique proposed in [7].\n\n2D Example: Figure 6(a) shows a synthetic example with 4 motion classes, each generated from a different closed-form dynamics and containing 160 data points. The color plot indicates the value of the combined modulation function h̃(x) = max_{m=1...M} h_m(x) where each of the functions h_m(x) is learned using the presented A-SVM technique. A total of 9 support vectors were obtained, which is < 10% of the number of training data points. The trajectories obtained after modulating the original dynamical systems flow along increasing values of the modulation function, thereby bifurcating towards different attractors at the region boundaries. Unlike the dynamical system in Fig. 3, the flow here is aligned with the training trajectories and terminates at the desired attractors. To recall, this is made possible thanks to the additional constraints (Eq. 4 and 5) in our formulation.\n\nIn a second example, we tested the ability of our model to accommodate a higher density of attractors. We created 8 synthetic dynamics by capturing motion data using a screen mouse. Figure 1 shows the resulting 8 attractor system.\n\nError Analysis: As formulated in Eq. 6, the Lyapunov constraints admit some slack, which allows the modulation to introduce slight deviations from the original dynamics. 
Here we statistically analyze this error via 5-fold cross-validation.\n\n[Figure 6: Synthetic 2D case with 4 attractors. Panels: (a) Combined flow; (b) Cross-validation error; (c) Best case errors. Legend: training data, modulated trajectories, attractors; classes 1-4; training and testing errors.]\n\nIn the 4 attractor problem presented above, we generate a total of 10 trajectories per motion class and use a 2:3 training to testing ratio for cross-validation. We calculate the average percentage error between the original velocity (read off from the data) and the modulated velocity (calculated using Eq. 2) for the m-th class as\n\ne_m = ⟨ ‖ẋ_i − f̃(x_i)‖ / ‖ẋ_i‖ × 100 ⟩_{i : l_i = m}\n\nwhere ⟨.⟩ denotes the average over the indicated range. Figure 6(b) shows the cross-validation error (mean and standard deviation over the 5 folds) for a range of values of the kernel width. The general trend revealed here is that for each class of motion, there exists a band of optimal values of the kernel width for which the testing error is the smallest. The region covered by this band of optimal values may vary depending on the relative location of the attractors and other data points. In Fig. 6(a), motion classes 2 (upper left) and 4 (upper right) are better fitted and show less sensitivity to the choice of kernel width than classes 1 (lower left) and 3 (lower right). We will show later in this section that this is correlated with the distance between the attractors. A comparison of testing and training errors for the least error case is shown in Fig. 6(c). 
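The error measure e_m above is straightforward to compute from paired velocity samples; a small sketch (the function name is ours):

```python
import math

def percent_velocity_error(v_true, v_model):
    """Average percentage deviation between demonstrated velocities and the
    modulated DS velocities: mean over samples of ||v - v_model|| / ||v|| * 100."""
    errs = []
    for v, vm in zip(v_true, v_model):
        norm = math.sqrt(sum(c * c for c in v))
        diff = math.sqrt(sum((a - b) ** 2 for a, b in zip(v, vm)))
        errs.append(diff / norm * 100.0)
    return sum(errs) / len(errs)
```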
We see that the testing errors for all the classes in the best case scenario are less than 1%.\n\n[Figure 7: Test trajectories generated from several points on an isocurve (dotted line) to determine spurious attractors. Legend: h(x) = const, ROA boundary, meshed contour, actual attractor, spurious attractor.]\n\nSensitivity analysis: The partitioning of space created by our method results in M regions of attraction (ROA), one for each of our M attractors. To assess the size of these regions and the existence of spurious attractors, we adopt an empirical approach. For each class, we compute the isosurfaces of the corresponding modulation function h_m(x) in the range [0, h_m(x*)]. These hypersurfaces incrementally span the volume of the m-th region around its attractor. We mesh each of these test surfaces and compute trajectories starting from the obtained mesh points, looking for spurious attractors. h_ROA is the isosurface of maximal value that encloses no spurious attractor and marks the ROA of the corresponding motion dynamics. We use the example in Fig. 5 to illustrate this process. Figure 7 shows a case where one spurious attractor is detected using a larger test surface (dotted line) whereas the actual ROA (solid line) is smaller. Once h_ROA is calculated, we define the size of the ROA as r_ROA = (h(x*) − h_ROA)/h(x*). r_ROA = 0 when no trajectory, except those originating at the attractor itself, leads to the attractor. r_ROA = 1 when the ROA is bounded by the isosurface h(x) = 0. The size of the ROA is affected by both the choice of kernel width and the distance between nearby attractors. This is illustrated in Fig. 9 using data points from class 1 of Fig. 6(a) and translating the attractors so that they are either very far apart (left, distance d_att = 1.0) or very close to one another (right, d_att = 0.2). As expected, r_ROA increases as we reach the optimal range of parameters. 
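The empirical ROA test described above amounts to integrating the modulated DS from each mesh point and checking where the trajectory ends. A minimal forward-Euler sketch of that check (the step size, tolerance and iteration cap are our illustrative choices, not values from the paper):

```python
def converges_to(flow, x0, attractor, tol=1e-2, dt=0.01, max_steps=20000):
    """Integrate x' = flow(x) from x0 with forward Euler and report whether the
    trajectory ends within `tol` of the given attractor. A mesh point for which
    this returns False has reached some other (possibly spurious) attractor."""
    x = list(x0)
    for _ in range(max_steps):
        v = flow(x)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
        if sum((a - b) ** 2 for a, b in zip(x, attractor)) < tol ** 2:
            return True
    return False
```

Running this test over all mesh points of a candidate isosurface, and shrinking the isosurface until every point converges to the desired attractor, yields h_ROA.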
Furthermore, when the attractors are farther apart, high values of r_ROA are obtained for a larger range of values of the kernel width, i.e., the model is less sensitive to the chosen kernel width. With a smaller distance between the attractors (Fig. 9(b)), only a small deviation from the optimal kernel width results in a considerable loss in r_ROA, exhibiting high sensitivity to the model parameter.\n\n3D Example: We validated our method on a real world 3D problem. The attractors here represent manually labeled grasping points on a pitcher. The 3D model of the object was taken from the ROS IKEA object library. We use the 7-DOF KUKA-LWR arm mounted on the 3-DOF KUKA-Omnirob base for executing the modulated Cartesian trajectories in simulation.\n\n[Figure 8: 3D Experiment. (a) shows training trajectories for three manually chosen grasping points. (b) shows the isosurfaces h_m(x) = 0; m = 1, 2, 3 along with the locations of the corresponding attractors. In (c) and (d), the robot executes the generated trajectories starting from different positions and hence converging to different grasping points. (e) shows the complete flow of motion.]\n\nWe control all 10 DOF of the robot using the damped least squares inverse kinematics. Training data for this implementation was obtained by recording the end-effector positions x_i ∈ R³ from kinesthetic demonstrations of reach-to-grasp motions directed towards these grasping points, yielding a 3-class problem (see Fig. 8(a)). 
Each class was represented by 75 data points. Figure 8(b) shows the isosurfaces hm(x) = 0; m ∈ {1, 2, 3} learned using the presented method. Figures 8(c)-(d) show the robot executing two trajectories started from two different locations, each converging to a different attractor (grasping point). Figure 8(e) shows the flow of motion around the object. Note that the time required to generate each trajectory point is O(S), where S denotes the total number of support vectors in the model. In this particular example, with a total of 18 SVs, the trajectory points were generated at 1000 Hz, which is well suited for real-time control. Such a fast generative model allows the robot to switch on-the-fly between the attractors and adapt to real-time perturbations of the object or the end-effector pose, without any re-planning or re-learning. Results for another object (champagne glass) are included in Appendix D (Fig. D.1). A video illustrating how the robot exploits multiple attractors to catch one of the grasping points on the object as it falls is also provided in the supplementary material.

5 Conclusions

In this work, we presented the A-SVM model for combining non-linear dynamical systems through a partitioning of the space. We reformulated the optimization framework of SVM to encapsulate constraints that ensure accurate reproduction of the dynamics of motion. The new set of constraints results in a new class of support vectors that exploit partial derivatives of the kernel function to align the flow of trajectories with the training data. The resulting model behaves as a multi-stable DS with attractors at the desired locations. Each of the classified regions is forward invariant w.r.t. the learned DS, which ensures that the trajectories do not cross over region boundaries.
We validated the presented method on synthetic motions in 2D and on 3D grasping motions with real objects. Results show that even though spurious attractors may occur, in practice they can be avoided by a careful choice of model parameters through grid search. The applicability of the method to real-time control of a 10-DOF robot was also demonstrated.

Figure 9: Variation of rROA with varying model parameters C and kernel width σ. (a) datt = 1.0; (b) datt = 0.2.

Acknowledgments

This work was supported by EU Project First-MM (FP7/2007-2013) under grant agreement number 248258. The authors would also like to thank Prof. François Margot for his insightful comments on the technical material.