{"title": "Application of SVMs for Colour Classification and Collision Detection with AIBO Robots", "book": "Advances in Neural Information Processing Systems", "page_first": 635, "page_last": 642, "abstract": "", "full_text": "Application of SVMs for Colour Classi\ufb01cation\n\nand Collision Detection with AIBO Robots\n\nMichael J. Quinlan, Stephan K. Chalup and Richard H. Middleton\u2217\n\nSchool of Electrical Engineering & Computer Science\nThe University of Newcastle, Callaghan 2308, Australia\n\n{mquinlan,chalup,rick}@eecs.newcastle.edu.au\n\nAbstract\n\nThis article addresses the issues of colour classi\ufb01cation and collision de-\ntection as they occur in the legged league robot soccer environment of\nRoboCup. We show how the method of one-class classi\ufb01cation with sup-\nport vector machines (SVMs) can be applied to solve these tasks satisfac-\ntorily using the limited hardware capacity of the prescribed Sony AIBO\nquadruped robots. The experimental evaluation shows an improvement\nover our previous methods of ellipse \ufb01tting for colour classi\ufb01cation and\nthe statistical approach used for collision detection.\n\n1 Introduction\n\nAutonomous agents offer a wide range of possibilities to apply and test machine learning\nalgorithms, for example in vision, locomotion, and localisation. However, training-time\nrequirements of sophisticated machine learning algorithms can overstrain the hardware of\nreal world robots. Consequently, in most cases, ad hoc methods, hard coding of expert\nknowledge, and hand-tuning of parameters, or similar approaches were preferred over the\nuse of learning algorithms on the robot. Application of the latter was often restricted to\nsimulations which sometimes could support training or tuning of the real world robot pa-\nrameters. 
However, the gap between simulation and the real world was often too wide for training results to transfer usefully from the simulated to the real robot.

A few years ago it may have been regarded as infeasible to consider the use of support vector machines [1, 2, 3] on real world robots with restricted processing capabilities. During the first years after their invention, support vector machines had the reputation of being more a theoretical concept than a method that could be efficiently applied in real world situations. One of the main reasons for this was the complexity of the quadratic programming part. In recent years it has become possible to speed up optimisation for SVMs in various ways [4]. SVMs have since been applied successfully to many tasks, primarily in the areas of data mining and pattern classification.

With the present study we explore the feasibility and usefulness of one-class SVM classification [5] for tasks faced by AIBO robots within the legged league environment of RoboCup [6]. We focus on two particularly critical issues: detection of objects based on correct colour classification, and detection of robot-to-robot collisions. Neither issue appeared to be sufficiently solved and implemented by the teams of RoboCup 2002, and both caused significant deterioration in the quality of play even in the world-best teams of that league.

*http://www.robots.newcastle.edu.au

The article has five more sections, addressing the environment and tasks and the methods, followed by the experiments and applications for colour classification and collision detection, respectively.
The article concludes with a summary.

2 Environment and tasks

The restricted real world environment and the uniformly prescribed hardware of the legged league [6] of RoboCup provide a good compromise for testing machine learning algorithms on autonomous agents, with a view towards possible applications in more general real world environments.

A soccer team in the legged league consists of four robots, including one goal keeper. Each team is identified by robots wearing either a red or a blue coloured ‘uniform’. The soccer matches take place on a green enclosed carpeted field with white boundaries. Two goals, a blue and a yellow, are positioned on opposite ends of the field. To aid localisation, six beacons are placed regularly around the field, each uniquely identifiable by a specific colour pattern. The ball is orange plastic and of a suitable size to be easily moved around by the robots. The games consist of two ten-minute halves under strict rules imposed by independent referees.

The legged league of RoboCup 2003 prescribed the use of Sony AIBO entertainment robots, models ERS-210 or the newer ERS-210A. Both have an internal 64-bit RISC processor, with clock speeds of 192 MHz and 384 MHz, respectively. The robots are programmed in a C++ software environment using Sony's OPEN-R software development kit [7]. They have 16 MB of memory accessible by user programs. The dimensions of the robot (width × height × length) are 154 mm × 266 mm × 274 mm (not including the tail and ears) and the mass is approximately 1.4 kg. The AIBO has 20 degrees of freedom (DOF): neck 3 DOF (pan, tilt, and roll), ears 1 DOF × 2, chin 1 DOF, legs 3 DOF (abductor, rotator, knee) × 4 and tail 2 DOF (up-down, left-right).

Among other sensors, the AIBO has a 1/6 inch colour CMOS camera capable of 25 frames per second.
The images are gathered at a resolution of 352 (H) × 288 (V), but middleware restricts the available resolution to a maximum of 176 (H) × 144 (V). The lens has an aperture of 2.0 and a focal length of 2.18 mm. Additionally, the camera has a field of vision of 23.9° up and down and 28.8° left and right. To help achieve results in different lighting conditions, the camera allows the modification of three parameters: White Balance, Shutter Speed and Gain.

2.1 Colour classification task

The vision system for most teams consists of four main tasks: Colour Classification, Run Length Encoding, Blob Formation and Object Recognition (Figure 1).

The classification process takes the image from the camera in a YUV bitmap format [8]. Each pixel in the image is assigned a colour label (i.e. ball orange, beacon pink, etc.) based on its YUV values. A lookup table (LUT) is used to determine which YUV values correspond to which colour labels. The critical point is the initial generation of the LUT. Since the robot is extremely reliant on colour for object detection, a new LUT has to be generated after any change in lighting conditions. Currently this is a manual task which requires a human to take hundreds of images and assign a colour label on a pixel-by-pixel basis. Using this method each LUT can take hours to create, yet it will still contain holes and classification errors.

Figure 1: Vision System of the NUbots Legged League Team [9]

2.2 Collision detection task

The goal is to detect collisions using the limited sensors provided by the AIBO robot. The camera and infrared distance sensor on the AIBO do not provide enough support for avoiding obstacles unless the speed of the robot is dramatically decreased. For these reasons we have chosen to use information obtained from the joint sensors (i.e.
the angle of the joint) as the input to our collision detection system [10].

3 One-class SVM classification method

An approach to one-class SVM classification was proposed by Schölkopf et al. [5]. Their strategy is to map the data into the feature space corresponding to the kernel function and to separate them from the origin with maximum margin. This implies the construction of a hyperplane such that $w \cdot \Phi(x_i) - \rho \ge 0$. The result is a function $f$ that returns the value +1 in the region containing most of the data points and -1 elsewhere. Assuming the use of an RBF kernel and $i, j \in 1, \ldots, \ell$, we are presented with the dual problem:

$$\min_{\alpha} \; \frac{1}{2} \sum_{ij} \alpha_i \alpha_j k(x_i, x_j) \quad \text{subject to} \quad 0 \le \alpha_i \le \frac{1}{\nu\ell}, \quad \sum_i \alpha_i = 1 \qquad (1)$$

The offset $\rho$ can be recovered from the fact that for any $\alpha_i$ strictly between its bounds, the corresponding pattern $x_i$ lies on the boundary and satisfies:

$$\rho = \sum_j \alpha_j k(x_j, x_i)$$

The resulting decision function $f$ (the support of the distribution) is:

$$f(x) = \operatorname{sign}\Big(\sum_i \alpha_i k(x_i, x) - \rho\Big)$$

An implementation of this approach is available in the LIBSVM library [11]. It solves a scaled version of (1):

$$\min_{\alpha} \; \frac{1}{2} \sum_{ij} \alpha_i \alpha_j k(x_i, x_j) \quad \text{subject to} \quad 0 \le \alpha_i \le 1, \quad \sum_i \alpha_i = \nu\ell$$

For our applications we use an RBF kernel with parameter $\gamma$ in the form $k(x, y) = e^{-\gamma \|x - y\|^2}$. The parameter $\nu$ approximates the fraction of outliers and support vectors [5].

3.1 Method for colour classification

The classification functions we seek take data that has been manually clustered to produce sets $X^k = \{x_i^k \in \mathbb{R}^3;\ i = 1, \ldots, N_k\}$ of colour space data for each object colour $k$. Each $X^k$ corresponds to a set of colour values in the YUV space corresponding to one of the known colour labels.

An individual one-class SVM is created for each colour, with $X^k$ being used as the training data (each element in the set is scaled between -1 and 1). By training with an extremely low $\nu$ and a large $\gamma$, the boundary formed by the decision function approximates the region that contains the majority $(1 - \nu)$ of the points in $X^k$. In addition, the SVM has the advantage of simultaneously removing the outliers that occur during manual classification.

The new colour set is constructed by attempting to classify every point in the YUV space ($64^3$ elements). All points that return a value of +1 are inside the region and therefore deemed to be of colour $k$.

One-class SVM was chosen because it allows us to treat each individual colour optimally. To avoid misclassification, each point in YUV space that does not strongly correspond to one of the known colours must remain classified as unknown. In addition, the colours were originally selected because they are located in different areas of the YUV space. Because of this we can choose to treat each colour without regard to the location and shape of the other colours. For these reasons we are not interested in using a multi-class technique to form a hyperplane that provides an optimal separation between the colours.

3.2 Method for collision detection

For collision detection the one-class SVM is employed as a novelty detection mechanism. In our implementation each training point is a vector containing thirteen elements.
These include five walk parameters (stepFrequency, backStrideLength, turn, strafe and timeParameter) along with a sensor reading from the abductor and rotator joints on each of the four legs. Once trained, the SVM's decision function will return +1 for all values that relate to a “normal” step, and -1 for all steps that contain a fault.

Speed is of the greatest importance in the RoboCup domain. For this reason a collision detection system must attempt to minimise the generation of false positives (detecting a collision that we deemed not to have happened) while still finding a high percentage of actual collisions. Few false positives are achieved by keeping the kernel parameter γ high, but this has the side effect of lowering the generalisation over the data set, which results in the need for an increased number of training points. In a real world robotic system the need for more training points greatly increases the training time and, in turn, the wear on the machinery.

4 Experiments and application to colour classification

The SVM can be used in two situations during the colour classification procedure. The first is during the construction of a new LUT, where it can be applied to increase the speed of classification.

By keeping γ low while the number of training points is low, a rough estimate of the final shape can be obtained. By continuing the manual classification and increasing γ, a closer approximation to the area containing the training data is obtained. In this manner a continually improving LUT can be constructed until it is deemed adequate.

An extreme example of this application is during the set-up phase at a competition. In the past, when we arrived at a new venue all system testing was delayed until the generation of a LUT. Of critical importance is testing the locomotion engine on the new carpet and in particular ball chasing.
The task of ball chasing relies on the classification of ball orange. Thus a method of quickly but roughly classifying orange is valuable. By manually classifying a few images of the ball and then training the SVM with a very low γ, a sphere containing all possible values for the ball is generated.

The second situation in which we use the one-class SVM is on a completed LUT. Either all colours in the table can be retrained (i.e. updating an old table) or an individual colour is retrained due to an initial classification error. This procedure can be performed either on the robot or on a remote computer.

Empirical tests have indicated that ν = 0.025 and γ = 250 provide excellent results on a previously constructed LUT. The initial table contained 3329 entries, while after training the table contained 6989 entries. The most evident change can be seen in the classification of the colour white, see Figure 2.

The LUTs were compared over 60 images, which equates to 1,520,640 individual pixel comparisons. The initial table generated 144,098 classification errors. The new LUT produced 117,652 errors, which equates to an 18% reduction in errors.

Figure 2: Image Comparison: The left image is classified with the original LUT and the image on the right with the updated LUT.
Black pixels indicate an unknown colour.

4.1 Comparison with ellipsoid fitting

The previous method involved converting the existing LUT values from YUV to the HSI colour space [8] and fitting an ellipsoid, $E$, which can be represented by the quadratic form:

$$E(x_0, Q) = \left\{ x \in \mathbb{R}^3 : (x - x_0)^T Q^{-1} (x - x_0) \le 1 \right\} \qquad (2)$$

where $x_0$ is the centre of the ellipsoid, and the size, orientation and shape of the ellipsoid are contained in the positive definite symmetric matrix $Q = Q^T > 0 \in \mathbb{R}^{3 \times 3}$.

Note that this definition of the shape can alternatively be represented by the linear matrix inequality (LMI):

$$x_i \in E \iff \begin{bmatrix} Q & (x_i - x_0) \\ (x_i - x_0)^T & 1 \end{bmatrix} \ge 0 \qquad (3)$$

The LMI (3) is linear in the unknowns $Q$ and $x_0$, and this therefore leads to the convex optimisation:

$$(Q, x_0) = \operatorname*{argmin}_{Q = Q^T > 0,\ x_0 \,:\, \text{(3) holds for } i = 1, \ldots, N_k} \{\operatorname{tr}(Q)\}$$

Note that minimising the trace of $Q$ ($\operatorname{tr}(Q)$) is the same as minimising the sum of the diagonal elements of $Q$, which is the same as minimising the sum of the squares of the lengths of the principal semi-axes of the ellipsoid. The ellipsoidal shape defined in (2) has the disadvantage of restricting the shape of possible regions in the colour space. However, it does have the advantage of a simple representation and a convex shape.

Before the ellipsoid was fitted, potential outliers and duplicate points were identified and removed. The removal of outliers is important in avoiding too large a region. Duplicate points were removed since they increase computation without adding any information.

For the comparison we use the initial LUT from the above example. Figure 3 shows the effects of each method on the colour white.
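The trace-minimising fit above requires a semidefinite-programming solver. As an illustration only, the following sketch instead scales the sample covariance until every point satisfies the quadratic form in (2); this is a simplified heuristic stand-in, not the paper's convex optimisation, and the data is synthetic:

```python
import numpy as np

def fit_covering_ellipsoid(points):
    """Return (x0, Q) such that every point lies in E(x0, Q).

    Heuristic: centre at the mean, shape from the sample covariance,
    scaled so the most distant point (in Mahalanobis terms) sits on
    the boundary. Outliers should be removed beforehand, as in the text.
    """
    x0 = points.mean(axis=0)
    cov = np.cov(points, rowvar=False)
    d = points - x0
    # Squared Mahalanobis distance of each point under the covariance shape.
    m2 = np.einsum("ij,jk,ik->i", d, np.linalg.inv(cov), d)
    Q = cov * m2.max()   # now max_i (x_i - x0)^T Q^{-1} (x_i - x0) = 1
    return x0, Q

def in_ellipsoid(x0, Q, x):
    d = x - x0
    return float(d @ np.linalg.solve(Q, d)) <= 1.0 + 1e-9

# Synthetic stand-in for white pixels expressed in HSI space.
rng = np.random.default_rng(1)
pts = rng.normal([0.1, 0.2, 0.9], [0.05, 0.04, 0.03], size=(500, 3))
x0, Q = fit_covering_ellipsoid(pts)
```

Unlike the trace-minimising SDP, this heuristic does not yield the smallest covering ellipsoid, but it shares the over-classification behaviour discussed below: the whole ellipsoid is labelled as the colour even where it contains no training points.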
To make the comparison with ellipsoids, the initial LUT and the LUT generated by the SVM procedure are shown in the HSI colour space.

Figure 3: Colour classification in HSI colour space: A) Points manually classified as white. B) Ellipsoid fitted to these white points. C) Result of the one-class SVM technique, ν = 0.025 and γ = 10. D) Result of the one-class SVM technique, ν = 0.025 and γ = 250.

It is evident that the manual classification of white is rather incomplete and contains many holes that should be classified as white. The negative effect of these holes can be seen as noise in the left image of Figure 2.

Using the ellipsoid fitting method these holes are filled, but with the potential drawback of over-classification. From image B in Figure 3 it is evident that the top section and the bottom left of the ellipsoid contain no white entries, and it is therefore highly questionable whether this area should be classified as white.

Images C and D in the figure show the results of our one-class SVM method. It is clear from image D that the area now classified as white is a region that tightly fits the original training set.

5 Experiments and application to collision detection

The collision detection system is designed with the aim that the entire system can be run on the robot. This means adhering to the memory and processing capabilities of the device. On the AIBO we have a maximum of 8MB of memory available for collision detection, enough for a total of 20,000 training points. This is the equivalent of 1000 steps, which equates to approximately 10 minutes of training time. The training set is generated by having the robot behave normally on the field, but with the stipulation that all collisions are avoided.

The trained classifier analyses the on-line stream of joint data measurements in samples of ten consecutive data points.
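This windowed monitoring can be sketched as follows; the decision function standing in for the trained one-class SVM is a toy assumption, while the window of ten samples and the more-than-two-faults trigger follow the rule stated in the text:

```python
from collections import deque

def make_collision_monitor(decide, window=10, max_faults=2):
    """Wrap a trained one-class decision function `decide` (a 13-element
    walk/joint feature vector -> +1 for a normal step, -1 for a fault).

    Returns a callable fed one sample at a time; it reports a collision
    once more than `max_faults` of the last `window` samples are -1.
    """
    recent = deque(maxlen=window)

    def observe(features):
        recent.append(decide(features))
        faults = sum(1 for label in recent if label == -1)
        return len(recent) == window and faults > max_faults

    return observe

# Toy decision function standing in for the trained SVM: it flags a step
# whose first joint reading exceeds a threshold.
toy_decide = lambda f: -1 if f[0] > 0.5 else +1
monitor = make_collision_monitor(toy_decide)
```

On the robot, `decide` would be the LIBSVM decision function of Section 3 evaluated on the current thirteen-element sample.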
If more than two points in one sample are classified as -1, a collision is declared to be detected.

Initial parameters of ν = 0.05 and γ = 5 were chosen, based on the assumption that a collision point would lie considerably outside the training set. The results from these parameters were less than satisfying: only the largest of collisions (i.e. physically holding multiple legs) were detected. The solution to this problem could involve increasing ν, on the grounds that the initial training set contained many outliers, and/or increasing γ to improve the tightness of the classification.

Through a series of tests, most of which tended to lead to either over-classification or under-classification, parameters of ν = 0.05 and γ = 100 were settled on. In our system these parameters appear to give the best balance between minimising false positives and maximising correct detection of collisions.

5.1 Comparison with the previous statistical method

The previous method for collision detection, described in [10], involves observing a joint position substantially differing from its expected value. In our case an empirical study found two standard deviations to be a practical measure, see Figure 4. Initially we considered a collision to have occurred if a single error was found, but further investigation showed that finding multiple errors (in most cases three) in quick succession is necessary to warrant a warning that can be acted upon by the robot's behaviour system.

Figure 4: Rear rotators for a forwards walking boundary collision on both front legs, the front right leg hitting first. The bold line shows the path of a collided motion.
The dotted line represents the mean “normal” path of the joint (that is, during unobstructed motion), with the error bars indicating two standard deviations above and below.

One drawback of this method is that it relied on domain knowledge to arrive at two standard deviations. In addition, it required considerable storage space to hold the table of means and standard deviations for each parameter combination.

The previous statistical method had the advantage of extremely low computational expense; in fact it was a table look-up. The trade-off is increased space: this method required the allocation of approximately 6MB of memory during both the training and detection stages. Conversely, the SVM approach requires only about 1MB of memory during the detection phase, but at the cost of increased computation. Since the SVM approach was capable of running without reducing the frame rate, the extra memory could now be used for other applications.

With respect to accuracy, the SVM approach slightly outperformed the original statistical method for particular types of steps, including the common steps associated with chasing the ball. Other step types, such as an aggressive turn, did not show the same improvement. This is because the movement of the joints in some motions is more inconsistent, making accurate classification harder.

A possible solution may involve using multiple SVMs associated with different combinations of walk parameters, allowing the tuning of parameters on a specific basis. This solution would have the downside of requiring more memory.

6 Summary

The method of one-class classification with SVMs was successfully applied to the tasks of colour classification and collision detection using the restricted memory and processing power of the AIBO hardware. It was possible to run the SVM algorithm implemented in the C++ libraries of LIBSVM both off and on the robot.
In a comparison with previously used methods, the SVM-based methods generated better results, and in the case of colour classification the SVM approach was also more efficient and convenient.

Acknowledgments

We would like to thank William McMahan and Jared Bunting for their work on the previous vision classification method and Craig Murch for his extensive contributions to both the vision and locomotion systems. Michael J. Quinlan was supported by a University of Newcastle Postgraduate Research Scholarship.

References

[1] B. E. Boser, I. M. Guyon, and V. N. Vapnik. A training algorithm for optimal margin classifiers. In D. Haussler, editor, Proceedings of the 5th Annual ACM Workshop on Computational Learning Theory, pages 144–152, Pittsburgh, PA, July 1992. ACM Press.

[2] C. Cortes and V. Vapnik. Support-vector networks. Machine Learning, 20:273–297, 1995.

[3] V. Vapnik. The Nature of Statistical Learning Theory. Springer Verlag, New York, 1995.

[4] Bernhard Schölkopf and Alexander J. Smola. Learning with Kernels: Support Vector Machines, Regularization, Optimization and Beyond. The MIT Press, 2002.

[5] B. Schölkopf, J. C. Platt, J. Shawe-Taylor, A. J. Smola, and R. C. Williamson. Estimating the support of a high-dimensional distribution. Neural Computation, 13:1443–1471, 2001.

[6] RoboCup Legged League web site. http://www.openr.org/robocup/index.html.

[7] OPEN-R SDK. http://openr.aibo.com.

[8] Linda G. Shapiro and George C. Stockman. Computer Vision. Prentice Hall, 2001.

[9] J. Bunting, S. Chalup, M. Freeston, W. McMahan, R. Middleton, C. Murch, M. Quinlan, C. Seysener, and G. Shanks. Return of the NUbots! The 2003 NUbots Team Report, 2003. http://robots.newcastle.edu.au/publications/NUbotFinalReport2003.pdf.

[10] Michael J. Quinlan, Craig L. Murch, Richard H. Middleton, and Stephan K. Chalup. Traction monitoring for collision detection with legged robots.
In RoboCup 2003 Symposium, 2003.

[11] Chih-Chung Chang and Chih-Jen Lin. LIBSVM: a library for support vector machines, 2001. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
", "award": [], "sourceid": 2487, "authors": [{"given_name": "Michael", "family_name": "Quinlan", "institution": null}, {"given_name": "Stephan", "family_name": "Chalup", "institution": null}, {"given_name": "Richard", "family_name": "Middleton", "institution": null}]}