{"title": "Self-Organization of Associative Database and Its Applications", "book": "Neural Information Processing Systems", "page_first": 767, "page_last": 774, "abstract": null, "full_text": "SELF-ORGANIZATION OF ASSOCIATIVE DATABASE AND ITS APPLICATIONS \n\nHisashi Suzuki and Suguru Arimoto \n\nOsaka University, Toyonaka, Osaka 560, Japan \n\nABSTRACT \n\nAn efficient method of self-organizing associative databases is proposed together with \napplications to robot eyesight systems. The proposed databases can associate any input \nwith some output. In the first half of the discussion, an algorithm of self-organization is \nproposed. From the hardware point of view, it produces a new style of neural network. In the \nlatter half, its applicability to handwritten letter recognition and to an autonomous \nmobile robot system is demonstrated. \n\nINTRODUCTION \n\nLet a mapping f : X → Y be given. Here, X is a finite or infinite set, and Y is another \nfinite or infinite set. A learning machine observes any set of pairs (x, y) sampled randomly \nfrom X × Y. (X × Y means the Cartesian product of X and Y.) It then computes some \nestimate f̂ : X → Y of f so as to make the estimation error small in some measure. \n\nUsually we say that the faster the estimation error decreases as the number of samples \nincreases, the better the learning machine. However, such a statement about performance \nis incomplete, since it lacks consideration of the set F of candidates for f̂ assumed in advance. \nThen, how should we find good learning machines? To clarify this concept, \nlet us discuss some types of learning machines for a while, and thereby advance the \nunderstanding of the self-organization of associative databases. \n\nParameter Type \nAn ordinary type of learning machine assumes an equation relating x's and y's with \nparameters left undetermined, namely, a structure of f. 
This is equivalent to implicitly defining a \nset F of candidates for f̂. (F is some subset of the mappings from X to Y.) The machine then computes \nvalues of the parameters based on the observed samples. We call this type the parameter \ntype. \n\nFor a well-defined learning machine, if F ∋ f, then f̂ approaches f as the number of samples \nincreases. Otherwise, however, some estimation error remains forever. Thus, \nthe problem of designing a learning machine reduces to finding a proper structure of f in this \nsense. \n\nOn the other hand, the assumed structure of f must be as compact as possible \nto achieve fast learning. In other words, the number of parameters should be small, because \nif the parameters are few, some f̂ can be uniquely determined even when the observed \nsamples are few. However, this demand for properness conflicts with the demand for compactness. \nConsequently, in the parameter type, the more compact the assumed structure \nthat is still proper, the better the learning machine. This is the most elementary concept \nin designing learning machines. \n\nUniversality and Ordinary Neural Networks \nNow suppose that sufficient knowledge about f is given, though f itself is unknown. In \nthis case, it is comparatively easy to find proper and compact structures of f. \nOtherwise, however, it is sometimes difficult. A possible solution is to give up \ncompactness and assume an almighty structure that can cover various f's. A combination \nof orthogonal bases of infinite dimension is such a structure. Neural networks [1,2] \nare its approximations, obtained by truncating the dimension to a finite value for implementation. \n\n\u00a9 American Institute of Physics 1988 \n\nA main topic in designing neural networks is to establish such desirable structures of f. \nThis work includes developing practical procedures that compute values of coefficients from \nthe observed samples. 
Such discussions have been flourishing since 1980, and many efficient methods \nhave been proposed. Recently, even hardware units that compute coefficients in parallel \nfor speed-up are sold, e.g., ANZA, Mark III, Odyssey and E-1. \n\nNevertheless, in neural networks there is always a danger that some error remains \nforever in estimating f. Precisely speaking, suppose that a combination of a \nfinite number of bases can essentially define a structure of f. In other words, suppose that F ∋ f, or \nthat f is located near F. In such a case, the estimation error is zero or negligible. However, if f \nis distant from F, the estimation error never becomes negligible. Indeed, many studies \nreport that the following situation appears when f is too complex: once the estimation \nerror has converged to some value (> 0) as the number of samples increases, it hardly decreases \neven if the dimension is raised. This property is sometimes a considerable defect of \nneural networks. \n\nRecursive Type \nThe recursive type is founded on another methodology of learning, as \nfollows. At the initial stage, with no samples, the set F_0 (instead of the notation F) of candidates \nfor f̂ equals the set of all mappings from X to Y. After observing the first sample \n(x_1, y_1) ∈ X × Y, F_0 is reduced to F_1 so that f̂(x_1) = y_1 for any f̂ ∈ F_1. After observing \nthe second sample (x_2, y_2) ∈ X × Y, F_1 is further reduced to F_2 so that f̂(x_1) = y_1 and \nf̂(x_2) = y_2 for any f̂ ∈ F_2. Thus, the candidate set F gradually becomes smaller as the observation \nof samples proceeds. The f̂ after observing i samples, which we write f̂_i, is one of the maximum \nlikelihood estimates of f selected in F_i. Hence, in contrast to the parameter type, the \nrecursive type guarantees that f̂ approaches f as the number of samples increases. \nWhen the recursive type observes a sample (x_i, y_i), it rewrites the values f̂_{i-1}(x) to f̂_i(x) for \nsome x's correlated with the sample. 
Hence, this type has an architecture composed of a rule \nfor rewriting and a free memory space. Such an architecture naturally forms a kind of database \nthat builds up a data management system in a self-organizing way. However, this database \ndiffers from ordinary ones in the following sense: it not only records the samples already \nobserved, but also computes some estimate of f(x) for any x ∈ X. We call such a database an \nassociative database. \n\nThe first subject in constructing associative databases is how to establish the rule for \nrewriting. For this purpose, we adopt a measure called the dissimilarity. Here, a dissimilarity \nmeans a mapping d : X × X → {reals ≥ 0} such that for any (x, x̃) ∈ X × X, d(x, x̃) > 0 \nwhenever f(x) ≠ f(x̃). However, it is not necessarily defined by a single formula. It can be \ndefined by, for example, a collection of rules written in the form \"if ... then ...\". \n\nThe dissimilarity d defines a structure of f locally in X × Y. Hence, even if \nthe knowledge about f is imperfect, we can reflect it in d in some heuristic way. Hence, \nin contrast to neural networks, it is possible to accelerate learning by establishing \nd well. In particular, we can easily find simple d's for those f's that process information \nanalogically, like a human. (See the applications in this paper.) For such f's, the \nrecursive type shows its effectiveness strongly. \n\nWe denote a sequence of observed samples by (x_1, y_1), (x_2, y_2), .... One of the simplest \nconstructions of an associative database after observing i samples (i = 1, 2, ...) is as follows. \n\nAlgorithm 1. At the initial stage, let S_0 be the empty set. For every i = \n1, 2, ..., let f̂_{i-1}(x) for any x ∈ X equal some y* such that (x*, y*) ∈ S_{i-1} and \n\nd(x, x*) = min_{(x̃, ỹ) ∈ S_{i-1}} d(x, x̃).    (1) \n\nFurthermore, add (x_i, y_i) to S_{i-1} to produce S_i, i.e., S_i = S_{i-1} ∪ {(x_i, y_i)}. \n\nAnother version, improved to economize memory, is as follows. \n\nAlgorithm 2. At the initial stage, let S_0 be composed of an arbitrary element \nin X × Y. For every i = 1, 2, ..., let f̂_{i-1}(x) for any x ∈ X equal some y* such \nthat (x*, y*) ∈ S_{i-1} and \n\nd(x, x*) = min_{(x̃, ỹ) ∈ S_{i-1}} d(x, x̃). \n\nFurthermore, if f̂_{i-1}(x_i) = y_i then let S_i = S_{i-1}; otherwise add (x_i, y_i) to S_{i-1} to \nproduce S_i, i.e., S_i = S_{i-1} ∪ {(x_i, y_i)}. \n\nIn either construction, f̂_i approaches f as i increases. However, the computation time \ngrows in proportion to the size of S_i. The second subject in constructing associative \ndatabases is what addressing rule we should employ to economize the computation time. In \nthe subsequent sections, a construction of an associative database for this purpose is proposed. \nIt manages data in the form of a binary tree. \n\nSELF-ORGANIZATION OF ASSOCIATIVE DATABASE \n\nGiven a sample sequence (x_1, y_1), (x_2, y_2), ..., the algorithm for constructing an associative \ndatabase is as follows. \n\nAlgorithm 3. \nStep 1 (Initialization): Let (x[root], y[root]) = (x_1, y_1). Here, x[·] and y[·] are \nvariables assigned to the respective nodes to memorize data. Furthermore, let t = 1. \nStep 2: Increase t by 1, and input x_t. After resetting a pointer n to the root, repeat \nthe following until n arrives at some terminal node, i.e., a leaf. \n\nThe notations n' and n'' denote the two descendant nodes of n. If d(x_t, x[n']) ≤ \nd(x_t, x[n'']), let n = n'. Otherwise, let n = n''. \n\nStep 3: Display y[n] as the related information. Next, input y_t. If y[n] = y_t, go back \nto step 2. Otherwise, first establish new descendant nodes n' and n''. Secondly, \nlet \n\n(x[n'], y[n']) = (x[n], y[n]),    (2) \n(x[n''], y[n'']) = (x_t, y_t).    (3) \n\nFinally, go back to step 2. Here, the loop of steps 2-3 can be stopped at any time \nand can also be resumed. \n\nNow, suppose that gate elements, namely, artificial \"synapses\" that play the role of branching \nby d, are prepared. 
Then, we obtain a new style of neural network with gate elements \nbeing randomly connected by this algorithm. \n\nLETTER RECOGNITION \n\nRecently, the vertical slitting method for recognizing typographic English letters [3], the \nelastic matching method for recognizing handwritten discrete English letters [4], the global \ntraining and fuzzy logic search method for recognizing Chinese characters written in square \nstyle [5], etc. have been published. The self-organization of associative database realizes the recognition \nof handwritten continuous English letters. \n\nFig. 1. Source document. \n\nFig. 2. Windowing. \n\nFig. 3. An experiment result (horizontal axes: number of samples). \n\nAn image scanner takes a document image (Fig. 1). The letter recognizer uses a parallelogram \nwindow that can cover at least the maximal letter (Fig. 2), and processes the \nsequence of letters while shifting the window. That is, the recognizer scans a word in a \nslanted direction, and it places the window so that its left edge lies on the first black \npoint detected. Then, the window catches a letter and some part of the succeeding letter. \nOnce the head letter is recognized, its end position, namely, the boundary line \nbetween the two letters, becomes known. Hence, by starting the scanning from this boundary \nand repeating the above operations, the recognizer accomplishes the task recursively. Thus \nthe major problem reduces to identifying the head letter in the window. \n\nConsidering this, we define the following. \n\n\u2022 Regard window images as x's, and define X accordingly. \n\u2022 For a (x, x̃) ∈ X × X, denote by B a black point in the left area from the boundary on \nwindow image x. 
Project each B onto window image x̃. Then, measure the Euclidean \ndistance δ between B and the black point B̃ on x̃ that is closest to B. Let d(x, x̃) be \nthe summation of the δ's for all black points B on x divided by the number of B's. \n\n\u2022 Regard couples of the \"reading\" and the position of the boundary as y's, and define Y \naccordingly. \n\nAn operator interactively teaches the recognizer the relation between window image and \nreading & boundary with algorithm 3. Precisely, if the recalled reading is incorrect, the \noperator teaches a correct reading via the console. Moreover, if the boundary position is \nincorrect, the operator teaches a correct position via the mouse. \n\nFig. 1 shows part of a document image used in this experiment. Fig. 3 shows the \nchange of the number of nodes and that of the recognition rate, defined as the relative \nfrequency of correct answers in the past 1000 trials. The specifications of the window are height \n= 20 dot, width = 10 dot, and slant angle = 68 deg. In this example, the levels of the tree \nwere distributed in 6-19 at time 4000 and the recognition rate converged to about 74%. \nExperimentally, the recognition rate converges to about 60-85% in most cases, and to 95% in \nrare cases. However, it does not attain 100% since, e.g., \"c\" and \"e\" are not distinguishable \nbecause of excessive fluctuation in writing. If the consistency of the x, y-relation is not \nassured, as in this case, the number of nodes increases endlessly (cf. Fig. 3). Hence, it is sensible to \nstop the learning when the recognition rate attains some upper limit. To improve \nthe recognition rate further, we must consider the spelling of words. This is one of the future subjects. \n\nOBSTACLE AVOIDING MOVEMENT \n\nVarious camera-type autonomous mobile robot systems have been reported [6-10]. \nThe system built by the authors (Fig. 4) also belongs to this category. 
Now, in mathematical \nmethodologies, we usually solve the problem of obstacle avoiding movement as \na cost minimization problem under some cost criterion established artificially. In contrast, \nthe self-organization of associative database faithfully reproduces the cost criterion of an \noperator. Therefore, the motion of the robot after learning becomes very natural. \n\nThe length, width and height of the robot are all about 0.7 m, and the weight is \nabout 30 kg. The visual angle of the camera is about 55 deg. The robot has the following three \nfactors of motion. It turns by less than \u00b130 deg, advances by less than 1 m, and controls its speed below \n3 km/h. The experiment was done on a passageway of width 2.5 m inside the building \nin which the authors' laboratories are located (Fig. 5). For the purposes of the experiment, we \narranged boxes, smoking stands, gas cylinders, stools, handcarts, etc. on the passageway at \nrandom. We let the robot take an image through the camera, recall a similar image, and \ntrace the route previously recorded on it. For this purpose, we define the following. \n\n\u2022 Let the camera face 28 deg downward to take an image, and process it through a low \npass filter. Scanning the filtered image vertically from the bottom to the top, search for \nthe first point C where the luminance changes excessively. Then, set all points \nfrom the bottom to C to white, and all points from C to the top to black (Fig. 6). \n(If no obstacle exists just in front of the robot, the white area shows the \"free\" area \nwhere the robot can move around.) Regard the binary 32 × 32 dot images processed in this way \nas x's, and define X accordingly. \n\n\u2022 For every (x, x̃) ∈ X × X, let d(x, x̃) be the number of black points on the exclusive-or \nimage between x and x̃. \n\n\u2022 Regard as y's the images obtained by drawing routes on the images x, and define Y \naccordingly. 
\n\nThe robot superimposes, on the current camera image x, the route recalled for x, and \nasks the operator for instructions. The operator judges subjectively whether the suggested \nroute is appropriate or not. If it is not, the operator draws a desirable route on x with the \nmouse to teach a new y to the robot. This operation implicitly defines a sample sequence \nof (x, y) reflecting the cost criterion of the operator. \n\nFig. 4. Configuration of autonomous mobile robot system (mobile unit (robot) and stationary unit). \n\nFig. 5. Experimental environment. \n\nFig. 6. Processing for obstacle avoiding movement. \n\nFig. 7. Processing for position identification. \n\nWe define the satisfaction rate by the relative frequency of acceptable suggestions of \nroute in the past 100 trials. In a typical experiment, the change of the satisfaction rate showed \na tendency similar to Fig. 3, and it attained about 95% around time 800. Here, notice that \nthe remaining 5% does not directly mean the percentage of collisions. (In practice, we prevent \ncollisions by adopting some supplementary measure.) At time 800, the number of nodes was \n145, and the levels of the tree were distributed in 6-17. \n\nThe proposed method delicately reflects various characters of the operator. For example, a \nrobot trained by an operator O moves slowly with enough space against obstacles, while one \ntrained by another operator O' brushes quickly against obstacles. This fact gives us a hint \non a method of printing \"characters\" into machines. 
\n\nPOSITION IDENTIFICATION \n\nThe robot can identify its position by recalling, for a camera image, a similar landscape together with \nits position data. For this purpose, in principle, it suffices to regard camera images and \nposition data as x's and y's, respectively. However, the memory capacity of actual \ncomputers is finite. Hence, we have no choice but to compress the camera images at a slight loss of information. \nSuch compression is admissible as long as the precision of position identification stays within an \nacceptable range. Thus, the major problem reduces to finding some suitable compression \nmethod. \n\nIn the experimental environment (Fig. 5), juts are on the passageway at intervals of \n3.6 m, and each section between adjacent juts has at most one door. The robot roughly identifies, \nfrom the surrounding landscape, which section it is located in, and it temporarily uses \na triangular surveying technique if an exact measurement is necessary. To realize the former task, \nwe define the following. \n\n\u2022 Turn the camera to take a panorama image of 360 deg. Scanning the \ncenter line horizontally, set the points where the luminance changes excessively to black \nand the other points to white (Fig. 7). Regard the binary 360 dot line images processed \nin this way as x's, and define X accordingly. \n\n\u2022 For every (x, x̃) ∈ X × X, project each black point A on x onto x̃, and measure the \nEuclidean distance δ between A and the black point Ã on x̃ that is closest to A. Let \nthe summation of the δ's be S. Similarly, calculate S̃ by exchanging the roles of x and x̃. \nDenoting the numbers of A's and Ã's respectively by n and ñ, define \n\nd(x, x̃) = (1/2) (S/n + S̃/ñ).    (4) \n\n\u2022 Regard the positive integers labeled on sections as y's (cf. Fig. 5), and define Y accordingly. \n\nIn the learning mode, the robot checks its exact position with a counter that is reset periodically \nby the operator. 
The robot runs arbitrarily on the passageways within an 18 m area \nand learns the relation between landscapes and position data. (Position identification beyond \nthe 18 m area is achieved by combining multiple databases with one another.) This task is automatic \nexcept for the periodic reset of the counter; namely, it is a kind of learning without a teacher. \n\nWe define the identification rate by the relative frequency of correct recalls of position \ndata in the past 100 trials. In a typical example, it converged to about 83% around time \n400. At time 400, the number of nodes was 202, and the levels of the tree were distributed in 5-22. \nSince the identification failures of 17% can be rejected by considering the trajectory, no \nproblem arises in practical use. In order to improve the identification rate, the compression \nratio of camera images must be loosened. Such a possibility depends on improvement of the \nhardware in the future. \n\nFig. 8 shows an example of actual motion of the robot based on the database for obstacle \navoiding movement and that for position identification. This example corresponds to a case \nof moving from 14 to 23 in Fig. 5. Here, the time interval per frame is about 40 sec. \n\nFig. 8. Actual motion of the robot. \n\nCONCLUSION \n\nA method of self-organizing associative databases was proposed with application to \nrobot eyesight systems. The machine decomposes an unknown global structure into a set of \nknown local structures and universally learns any input-output response. This framework \nimplies a wide application area beyond the examples shown in this paper. \n\nA defect of algorithm 3 of self-organization is that the tree is well balanced only \nfor a subclass of structures of f. A subject left to us is to widen this class. 
A probable \nsolution is to abolish the addressing rule that depends directly on values of d and, instead, to \nestablish another rule that depends on the distribution function of values of d. This is now under \ninvestigation. \n\nREFERENCES \n\n1. Hopfield, J. J. and D. W. Tank, \"Computing with Neural Circuits: A Model,\" \nScience 233 (1986), pp. 625-633. \n\n2. Rumelhart, D. E. et al., \"Learning Representations by Back-Propagating Errors,\" \nNature 323 (1986), pp. 533-536. \n\n3. Hull, J. J., \"Hypothesis Generation in a Computational Model for Visual Word \nRecognition,\" IEEE Expert, Fall (1986), pp. 63-70. \n\n4. Kurtzberg, J. M., \"Feature Analysis for Symbol Recognition by Elastic Matching,\" \nIBM J. Res. Develop. 31-1 (1987), pp. 91-95. \n\n5. Wang, Q. R. and C. Y. Suen, \"Large Tree Classifier with Heuristic Search and \nGlobal Training,\" IEEE Trans. Pattern Anal. & Mach. Intell. PAMI 9-1 \n(1987), pp. 91-102. \n\n6. Brooks, R. A. et al., \"Self Calibration of Motion and Stereo Vision for Mobile \nRobots,\" 4th Int. Symp. of Robotics Research (1987), pp. 267-276. \n\n7. Goto, Y. and A. Stentz, \"The CMU System for Mobile Robot Navigation,\" 1987 \nIEEE Int. Conf. on Robotics & Automation (1987), pp. 99-105. \n\n8. Madarasz, R. et al., \"The Design of an Autonomous Vehicle for the Disabled,\" \nIEEE Jour. of Robotics & Automation RA 2-3 (1986), pp. 117-125. \n\n9. Triendl, E. and D. J. Kriegman, \"Stereo Vision and Navigation within Buildings,\" \n1987 IEEE Int. Conf. on Robotics & Automation (1987), pp. 1725-1730. \n\n10. Turk, M. A. et al., \"Video Road-Following for the Autonomous Land Vehicle,\" \n1987 IEEE Int. Conf. on Robotics & Automation (1987), pp. 273-279. \n", "award": [], "sourceid": 1, "authors": [{"given_name": "Hisashi", "family_name": "Suzuki", "institution": null}, {"given_name": "Suguru", "family_name": "Arimoto", "institution": null}]}