Part of Advances in Neural Information Processing Systems 4 (NIPS 1991)
Gale L. Martin, Mosfeq Rashid
This paper describes an approach, called centered object integrated segmentation and recognition (COISR), for integrating object segmentation and recognition within a single neural network. The application is hand-printed character recognition. Two versions of the system are described. One uses a backpropagation network that scans exhaustively over a field of characters and is trained to recognize whether it is centered over a single character or between characters. When it is centered over a character, the net classifies the character. The approach is tested on a dataset of hand-printed digits. Very low error rates are reported. The second version, COISR-SACCADE, avoids the need for exhaustive scans. The net is trained as before, but also is trained to compute ballistic 'eye' movements that enable the input window to jump from one character to the next.
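The exhaustive scan described above can be illustrated with a minimal sketch. It assumes a hypothetical `classify` callable standing in for the trained network, which returns a "centered over a character" score and a class label for each window position. The peak-picking rule, the `threshold` and `stride` parameters, and the toy classifier are illustrative assumptions, not details taken from the paper.

```python
from typing import Callable, List, Optional, Tuple

Image = List[List[float]]  # image as rows of pixel intensities


def scan_field(
    field: Image,
    window_width: int,
    classify: Callable[[Image], Tuple[float, Optional[int]]],
    stride: int = 1,
    threshold: float = 0.5,
) -> List[Tuple[int, Optional[int]]]:
    """Slide a window across the field and report (center_x, label) wherever
    the 'centered' score is a local peak at or above the threshold."""
    width = len(field[0])
    outputs = []
    for x in range(0, width - window_width + 1, stride):
        window = [row[x:x + window_width] for row in field]
        score, label = classify(window)  # stand-in for the trained net
        outputs.append((x, score, label))

    detections = []
    for i, (x, score, label) in enumerate(outputs):
        left = outputs[i - 1][1] if i > 0 else float("-inf")
        right = outputs[i + 1][1] if i + 1 < len(outputs) else float("-inf")
        # Assumed decision rule: keep local maxima of the centered score.
        if score >= threshold and score > left and score >= right:
            detections.append((x + window_width // 2, label))
    return detections


def toy_classify(window: Image) -> Tuple[float, Optional[int]]:
    """Hypothetical classifier: a nonzero center pixel v encodes digit v - 1."""
    center = window[0][len(window[0]) // 2]
    if center > 0:
        return 1.0, int(center) - 1
    return 0.0, None


# One-row toy field with the digits 2 and 7 encoded at columns 2 and 6.
field = [[0, 0, 3, 0, 0, 0, 8, 0, 0]]
detections = scan_field(field, window_width=3, classify=toy_classify)
print(detections)
```

In this sketch segmentation and classification come from the same window evaluation, which is the point of the integrated approach; the cost is one evaluation per window position, which the COISR-SACCADE version is intended to reduce.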
The common model of visual processing includes multiple, independent stages. First, filtering operations act on the raw image to segment or isolate and enhance to-be-recognized clumps. These clumps are normalized for factors such as size, and sometimes simplified further through feature extraction. The results are then fed to one or more classifiers. The operations prior to classification simplify the recognition task. Object segmentation restricts the number of features considered for classification to those associated with a single object, and enables normalization to be applied at the individual object level. Without such pre-processing, recognition may be an intractable problem. However, a weak point of this sequential stage model is that recognition and segmentation decisions are often inter-dependent. Not only does a correct recognition decision depend on first making a correct segmentation decision, but a correct segmentation decision often depends on first making a correct recognition decision.
This is a particularly serious problem in character recognition applications. OCR systems use intervening white space and related features to segment a field of characters into individual characters, so that classification can be accomplished one character at a time. This approach fails when characters touch each other or when an individual character is broken up by intervening white space. Some means of integrating the segmentation and recognition stages is needed.
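The failure modes of white-space segmentation can be seen in a small sketch. It assumes a binary image stored as rows of 0/1 pixels and splits it wherever a column contains no ink; the function name `segment_by_whitespace` and the toy images are hypothetical, and real OCR systems use additional features beyond blank columns.

```python
from typing import List, Tuple


def segment_by_whitespace(field: List[List[int]]) -> List[Tuple[int, int]]:
    """Return [start, end) column spans separated by entirely blank columns."""
    width = len(field[0])
    has_ink = [any(row[x] for row in field) for x in range(width)]
    spans = []
    start = None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x                  # a run of inked columns begins
        elif not ink and start is not None:
            spans.append((start, x))   # a blank column closes the run
            start = None
    if start is not None:
        spans.append((start, width))   # run extends to the right edge
    return spans


# Each toy field is meant to hold two characters, each two columns wide.
separated = [[1, 1, 0, 1, 1]]     # clean gap between the characters
touching = [[1, 1, 1, 1]]         # characters touch: no blank column
broken = [[1, 0, 1, 0, 1, 1]]     # first character split by a gap

print(segment_by_whitespace(separated))  # two spans, as intended
print(segment_by_whitespace(touching))   # one span: characters merged
print(segment_by_whitespace(broken))     # three spans: character split
```

Only the first case yields one span per character. In the other two, the segmentation error is made before any classifier sees the data, so it cannot be corrected by the recognition stage.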
This paper describes an approach, called centered object integrated segmentation and recognition (COISR), for integrating character segmentation and recognition within one
• Also with Eastman Kodak Company