{"title": "A Computational Model for Cursive Handwriting Based on the Minimization Principle", "book": "Advances in Neural Information Processing Systems", "page_first": 727, "page_last": 734, "abstract": null, "full_text": "A Computational Model for Cursive Handwriting \nBased on the Minimization Principle \n\nYasuhiro Wada * Yasuharu Koike \n\nEric Vatikiotis-Bateson \n\nMitsuo Kawato \n\nATR Human Information Processing Research Laboratories \n2-2 Hikaridai, Seika-cho, Soraku-gun, Kyoto 619-02, Japan \n\nABSTRACT \n\nWe propose a trajectory planning and control theory for continuous \nmovements such as connected cursive handwriting and continuous \nnatural speech. Its hardware is based on our previously proposed \nforward-inverse-relaxation neural network (Wada & Kawato, 1993). \nComputationally, its optimization principle is the minimum torque-change \ncriterion. Regarding the representation level, hard constraints \nsatisfied by a trajectory are represented as a set of via-points extracted \nfrom a handwritten character. Accordingly, we propose a via-point \nestimation algorithm that estimates via-points by repeating the \ntrajectory formation of a character and the via-point extraction from the \ncharacter. In experiments, good quantitative agreement is found \nbetween human handwriting data and the trajectories generated by the \ntheory. Finally, we propose a recognition schema based on the \nmovement generation. We show a result in which the recognition \nschema is applied to handwritten character recognition and can be \nextended to the phoneme timing estimation of natural speech. \n\n1 INTRODUCTION \n\nIn reaching movements, trajectory formation is an ill-posed problem because the hand \ncan move along an infinite number of possible trajectories from the starting point to the target \npoint. 
However, humans move an arm between two targets along one consistent trajectory out of an \ninfinite number of trajectories. Therefore, the brain should be able to compute a unique \nsolution by imposing an appropriate criterion on the ill-posed problem. In particular, \nsmoothness performance indices have been intensively studied in this context. \nFlash & Hogan (1985) proposed a mathematical model, the minimum-jerk model. Their \nmodel is based on the kinematics of movement, independent of the dynamics of the \nmusculoskeletal system. On the other hand, based on the idea that the objective function \nmust be related to dynamics, Uno, Kawato & Suzuki (1989) proposed the minimum \ntorque-change criterion, which accounts for the desired trajectory determination. The \ncriterion is based on the theory that the trajectory of the human arm is determined so as to \nminimize the time integral of the square of the rate of torque change. They proposed the \nfollowing quadratic measure of performance, where $\\tau^{j}$ is the torque generated by the \n$j$-th actuator of $M$ actuators, and $t_f$ is the movement time: \n\n$$C_T = \\int_0^{t_f} \\sum_{j=1}^{M} \\left( \\frac{d\\tau^{j}}{dt} \\right)^2 dt \\qquad (1)$$ \n\n* Present Address: Systems Lab., Kawasaki Steel Corporation, \nMakuhari Techno Garden, 1-3 Nakase, Mihama-ku, Chiba 261, Japan \n\nHandwriting production is an attractive subject in human motor control studies. In \ncursive handwriting, a symbol must be transformed into a motor command stream. This \ntransformation process raises several questions. How can the central nervous system \n(CNS) represent a character symbol for producing a handwritten letter? By what principle \ncan motor planning be made or a motor command be produced? In this paper we propose \na handwriting model whose computational theory and representation are the same as those of the \nmodel for reaching movements. 
Our proposed computational model for cursive \nhandwriting is assumed to generate a trajectory that passes through many via-points. The \ncomputational theory is based on the minimum torque-change criterion, and a \nrepresentation of a character is assumed to be expressed as a set of via-points extracted \nfrom a handwritten character. In reaching movement, the boundary condition is given by \nvisual information, such as the location of a cup, and the trajectory formation is based \non the minimum torque-change criterion, exactly as in the model of \nhandwriting (Fig. 1). However, it is quite difficult to determine the via-points needed to \nreproduce a cursive handwritten character. We propose an algorithm that can determine \nthe via-points of the handwritten character based only on the same minimization \nprinciple, without using any other ad hoc information such as zero-crossing \nvelocity (Hollerbach, 1981). \n\n[Figure 1 diagram: reaching (reach to the object) and handwriting (write a character) compared in terms of representation, computational theory, and hardware. Labels: visual information, location of the object, via-point (representation of character), via-point estimation algorithm.] \n\nFigure 1: A handwriting model. \n\n2 PREVIOUS WORK ON THE HANDWRITING MODEL \n\nSeveral handwriting models (Hollerbach, 1981; Morasso & Mussa-Ivaldi, 1982; Edelman \n& Flash, 1987) have been proposed. Hollerbach proposed a handwriting model based on \noscillation theory. The model basically used a vertical oscillator and a horizontal \noscillator. Morasso & Mussa-Ivaldi proposed a trajectory formation model using a spline \nfunction, and realized a handwritten character using the formation model. 
\nEdelman & Flash (1987) proposed a handwriting model based on snap (fourth derivative \nof position) minimization. A character was represented by four basic strokes, and a \nhandwritten character was regenerated by a combination of several strokes. However, \ntheir handwriting model differs from their theory for reaching movement, for which Flash & Hogan \n(1985) proposed the minimum-jerk criterion. \n\n3 A HANDWRITING MODEL \n\n3.1 Trajectory formation neural network: Forward-Inverse Relaxation Model (FIRM) \n\nFirst, we explain the trajectory formation neural network. Because the dynamics of the \nhuman arm are nonlinear, finding a unique trajectory based on the minimum torque-change \ncriterion is a difficult nonlinear optimization problem. \nThere are several criticisms of previously proposed neural networks based on the minimum \ntorque-change criterion: (1) they represent time spatially, (2) back propagation is \nessential, and (3) much computation time is required. Therefore, we have proposed a new neural \nnetwork, FIRM (Forward-Inverse Relaxation Model), for trajectory formation (Wada & \nKawato, 1993). This network can be implemented as a biologically plausible neural \nnetwork and resolves the above criticisms. \n\n3.2 Via-point estimation model \n\nEdelman & Flash (1987) have pointed out the difficulty of finding the via-points in a \nhandwritten character. They raised two issues: (1) the number of via-points, and (2) the \nreason for the choice of every via-point locus. It is clear from approximation theory that a \ncharacter can be regenerated perfectly if the number of extracted via-points is large. \nAppropriate via-points cannot be assigned according to a regular sampling rule if the \nsampling interval is constant and long. 
Therefore, there is an infinite number of \ncombinations of via-point numbers and positions in the problem of extracting via-points \nfrom a given trajectory, and a unique solution cannot be found unless a theory of trajectory \nformation is specified. That is, it is an ill-posed problem. \nThe algorithm for assigning the via-points finds them by iteratively activating \nboth the trajectory formation module (FIRM) and the via-point extraction module (Fig. \n2). The trajectory formation module generates a trajectory based on the minimum torque-change \ncriterion using the via-points extracted by the via-point extraction \nmodule. The via-point extraction module assigns the via-points so as to minimize the \nsquare error between the given trajectory and the trajectory generated by the trajectory \nformation module. The via-point extraction algorithm stops when the error between \nthe given trajectory and the trajectory generated from the extracted via-points reaches a \nthreshold. \n\n[Diagram labels (cf. Fig. 2): Via-Points Extraction Module: via-point assignment to decrease the error between the given trajectory and the minimum torque-change trajectory. Trajectory Formation Module (FIRM): trajectory generation from the via-points.] \n\nFigure 4: A result of via-point estimation in a movement with a via-point. (Horizontal axis: X [m].) \n\n4.2 Performance of the handwriting model \n\nFig. 5 shows the case of cursive connected handwritten characters. The handwriting \nmodel can generate trajectories and velocity curves of cursive handwritten characters that \nare almost identical to human data. The estimated via-points are classified into two \ngroups. 
The via-points in one group are extracted near the minimum points of the \nvelocity profile. The via-points of the other group are assigned to positions that are \nindependent of these minima. Generally, the minima of the velocity are considered \nto be the feature points of the movement. However, we confirmed that a given trajectory \ncannot be reproduced by using only the first group of via-points. This finding shows that \nthe second group of via-points is important. Our proposed algorithm based on the \nminimization principle can estimate points that cannot be selected by any kinematic \ncriterion. Furthermore, it is important in handwritten character recognition that the via-point \nestimation algorithm extracts via-points between characters, that is, their \nsegmentation points. \n\n[Figure 5 legend: filled circles, estimated via-points; dotted line, trajectory by model. Horizontal axis of (a): X (m).] \n\nFigure 5: Estimated via-points in cursive handwriting. (a) and (b) show the trajectory and \ntangential velocity profile, respectively. The via-point estimation algorithm extracts a via-point \n(segmentation point) between characters. \n\n5 FROM FORMATION TO RECOGNITION \n\n5.1 A recognition model \n\nNext, we propose a recognition system using the trajectory formation model and the via-point \nestimation model. Several reports in the psychology literature \nsuggest that the formation process is related to the recognition process (Liberman & \nMattingly, 1985; Freyd, 1983). \n\nHere, we present a pattern recognition model that strongly depends on the handwriting \nmodel and the via-point estimation model (Fig. 6). (1) The features of the handwritten \ncharacter are extracted by the via-point estimation algorithm. (2) Some of the via-points are \nsegmented and normalized in space and time. 
Then, (3) a trajectory is regenerated by \nusing the normalized via-points. (4) A symbol is identified by comparing the regenerated \ntrajectory with the template trajectory. \n\n[Figure 6 diagram labels: Recognizer (Reformation & Comparison); Symbol.] \n\nFigure 6: Movement pattern recognition using extracted via-points obtained through \nthe movement pattern generator. \n\n[Figure 7 listing: three best candidates for each handwritten word.] \n\n1: BAD: (0,17) (18,35) (36,52) \n2: BAD: (0,18) (18,35) (36,52) \n3: BAD: (0,17) (18,35) (35,52) \n\n1: DEAR: (0,8) (9,18) (19,31) (30,51) \n2: DEAR: (0,8) (9,18) (19,31) (30,50) \n3: DEAR: (0,8) (9,18) (19,30) (30,51) \n\nFigure 7: Results of character recognition \n\n5.2 Performance of the character recognition model \n\nFig. 7 shows a result of character recognition. The right-hand side shows the recognition \nresults for the handwritten input on the left-hand side. The best three candidates for recognition are listed. \nNumerals in parentheses give the starting and final via-point numbers \nfor each recognized character. \n\n5.3 Performance of the estimation of timing of phonemes in real speech \n\nFig. 8 shows the acoustic waveform, the spectrogram, and the articulation movement \nwhen the sentence \"Sam sat on top of the potato cooker ...\" is spoken. The phonemes are \nidentified, and the vertical lines denote phoneme midpoints. White circles show the via-points \nestimated by our proposed algorithm. Rather good agreement is found between the \nestimated via-points and the phonemes. \nFrom this experiment, we can point out two important possibilities for the estimation \nmodel of phoneme timing. The first possibility concerns speech recognition, and the \nsecond concerns speech data compression. 
It seems possible to extend the via-point \nestimation algorithm to speech recognition if a mapping from acoustics to articulator \nmotion is identified (Shirai & Kobayashi, 1991; Papcun et al., 1992). Furthermore, with \ntraining of a forward mapping from articulator motion to acoustic data (Hirayama et al., \n1993), the via-point estimation model can be used for speech data compression. \n\nFigure 8: Estimation result of phoneme timing. Temporal acoustics and vertical \npositions of the tongue blade (TBY), tongue tip (TTY), jaw (JY), and lower lip (LLY) \nare shown with overlaid via-point trajectories. Vertical lines correspond to acoustic \nsegment centers; white circles denote via-points. \n\n6 SUMMARY \n\nWe have proposed a new handwriting model. In experiments, good qualitative and \nquantitative agreement is found between human handwriting data and the trajectories \ngenerated by the model. Our model is unique in that the same optimization principle and \nhard constraints used for reaching are also used for cursive handwriting. Also, as \nopposed to previous handwriting models, determination of via-points is based on the \noptimization principle and does not use a priori knowledge. \nWe have demonstrated two applications to recognition: connected cursive handwritten character \nrecognition and the estimation of phoneme timing. We incorporated the formation model \ninto the recognition model and realized the recognition model suggested by Freyd (1983) \nand Liberman and Mattingly (1985). The most important point shown by the models is \nthat the human recognition process can be realized by specifying the human formation \nprocess. \n\nREFERENCES \n\nS. Edelman & T. Flash (1987) A model of handwriting. Biol. Cybern., 57, 25-36. \nT. Flash & N. Hogan (1985) The coordination of arm movements: An experimentally \nconfirmed mathematical model. Journal of Neuroscience, 5, 1688-1703. \nJ. J. 
Freyd (1983) Representing the dynamics of a static form. Memory & Cognition, 11, \n342-346. \nM. Hirayama, E. Vatikiotis-Bateson, K. Honda, Y. Koike, & M. Kawato (1993) \nPhysiologically based speech synthesis. In Giles, C. L., Hanson, S. J., and Cowan, J. D. \n(eds) Advances in Neural Information Processing Systems 5, 658-665. San Mateo, CA: \nMorgan Kaufmann Publishers. \nJ. M. Hollerbach (1981) An oscillation theory of handwriting. Biol. Cybern., 39, 139-156. \nA. M. Liberman & I. G. Mattingly (1985) The motor theory of speech perception \nrevised. Cognition, 21, 1-36. \nP. Morasso & F. A. Mussa-Ivaldi (1982) Trajectory formation and handwriting: A \ncomputational model. Biol. Cybern., 45, 131-142. \nJ. Papcun, J. Hochberg, T. R. Thomas, T. Laroche, J. Zacks, & S. Levy (1992) Inferring \narticulation and recognition gestures from acoustics with a neural network trained on x-ray \nmicrobeam data. Journal of the Acoustical Society of America, 92 (2) Pt. 1. \nK. Shirai & T. Kobayashi (1991) Estimation of articulatory motion using neural \nnetworks. Journal of Phonetics, 19, 379-385. \nY. Uno, M. Kawato, & R. Suzuki (1989) Formation and control of optimal trajectory in \nhuman arm movement - minimum torque-change model. Biol. Cybern., 61, 89-101. \nY. Wada & M. Kawato (1993) A neural network model for arm trajectory formation \nusing forward and inverse dynamics models. Neural Networks, 6(7), 919-932. \nY. Wada & M. Kawato (1994) Long version of this paper, in preparation. \n", "award": [], "sourceid": 830, "authors": [{"given_name": "Yasuhiro", "family_name": "Wada", "institution": null}, {"given_name": "Yasuharu", "family_name": "Koike", "institution": null}, {"given_name": "Eric", "family_name": "Vatikiotis-Bateson", "institution": null}, {"given_name": "Mitsuo", "family_name": "Kawato", "institution": null}]}