Learning to See Where and What: Training a Net to Make Saccades and Recognize Handwritten Characters

Part of Advances in Neural Information Processing Systems 5 (NIPS 1992)


Gale Martin, Mosfeq Rashid, David Chapman, James Pittman


This paper describes an approach to integrated segmentation and recognition of hand-printed characters. The approach, called Saccade, integrates ballistic and corrective saccades (eye movements) with character recognition. A single backpropagation net is trained to make a classification decision on a character centered in its input window, as well as to estimate the distance of the current and next character from the center of the input window. The net learns to accurately estimate these distances regardless of variations in character width, spacing between characters, writing style and other factors. During testing, the system uses the net-extracted classification and distance information, along with a set of jumping rules, to jump from character to character.
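The test-time procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not spell out the jumping rules, so the two rules used here (re-center when the current character is off-center, otherwise record the class and jump to the next character) are assumptions, as are all names (`NetOutput`, `read_line`, `make_toy_net`, `tolerance`, `max_steps`). The trained backpropagation net is replaced by a toy function that reports exact distances.

```python
from typing import Callable, Dict, List, NamedTuple


class NetOutput(NamedTuple):
    label: str      # class of the character nearest the window center
    d_current: int  # signed offset from window center to the current character
    d_next: int     # offset from window center to the next character (0 if none)


def read_line(net: Callable[[int], NetOutput], line_width: int,
              start: int = 0, tolerance: int = 1,
              max_steps: int = 1000) -> List[str]:
    """Jump along a line of text using hypothetical jumping rules."""
    x = start
    chars: List[str] = []
    for _ in range(max_steps):  # guard against endless corrective jumps
        if not 0 <= x < line_width:
            break
        out = net(x)
        if abs(out.d_current) > tolerance:
            # Assumed rule 1: corrective saccade, re-center on the current character.
            x += out.d_current
            continue
        # The character is centered, so accept the classification.
        chars.append(out.label)
        if out.d_next <= 0:
            break  # no further character reported
        # Assumed rule 2: ballistic saccade to the next character.
        x += out.d_next
    return chars


def make_toy_net(centers: Dict[int, str]) -> Callable[[int], NetOutput]:
    """Stand-in for the trained net: knows the true character centers."""
    positions = sorted(centers)

    def net(x: int) -> NetOutput:
        current = min(positions, key=lambda c: abs(c - x))
        later = [c for c in positions if c > current]
        d_next = later[0] - x if later else 0
        return NetOutput(centers[current], current - x, d_next)

    return net


if __name__ == "__main__":
    toy = make_toy_net({10: "3", 25: "7", 47: "1"})
    print(read_line(toy, line_width=60))
```

In this sketch the input window visits only the character centers plus any corrective stops, rather than every pixel position along the line, which is the efficiency argument behind jumping.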

The ability to read rests on multiple foundation skills. In learning how to read, people learn how to recognize individual characters centered in the visual field. They also learn how to move their eyes along a line of text, sequentially centering the visual field on successive characters. We believe that the key to developing optical character recognition (OCR) systems that can mimic human reading capabilities is to develop systems that can learn these and other skills in an integrated fashion. In this paper, we demonstrate that a backpropagation net can learn to navigate along a line of handwritten characters, as well as to recognize the characters centered in its visual field. The system, called Saccade, extends the current state of the art in OCR technology by using a single classifier to accurately and efficiently locate and recognize characters, regardless of whether they touch each other or are separate. The Saccade system was described briefly at the last NIPS conference (Martin & Rashid, 1992). In this paper, we describe it more fully and report on results demonstrating its accuracy and efficiency in recognizing handwritten digits.

The Saccade system takes a cue from the ballistic and corrective saccades (eye movements) of natural vision systems. Natural saccades make it possible to efficiently move from one informative area to another by jumping. The eye typically initiates a ballistic saccade to