Optical Handwritten Text Recognition

Introduction

The recognition of off-line, continuously handwritten text remains a challenging pattern recognition task. Although text is ultimately composed of characters, most traditional approaches to optical character recognition (OCR) fail at this task because continuously written text is extremely difficult to segment into characters.

Nevertheless, humans perform both segmentation and recognition accurately and in a seemingly effortless manner. This accuracy is achieved by "delaying" recognition until the highest level of perception: only after having understood a written message are humans able to "recognize" the constituent words, the corresponding characters and the underlying segmentations. Clearly, this striking human ability comes from a tight cooperation of morphological, lexical, syntactic and semantic-pragmatic knowledge.

Here, the problem of continuous handwritten text (CHT) recognition is addressed using standard continuous speech recognition technology. The main advantages of this approach are:

a) system development is based entirely on well-understood training techniques, and

b) no segmentation of sentence or line images into characters or words is required, in either the training or the recognition phase.
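
To make point b) more concrete, the sketch below shows one way a text-line image might be turned into a left-to-right sequence of feature vectors, much as a speech signal is cut into frames over time, so that no character or word boundaries are needed beforehand. This is only an illustration under assumptions: the function name line_image_to_frames, the window and step parameters, and the two features (ink density and vertical centre of gravity) are hypothetical and are not taken from the system described here.

```python
# Hypothetical sketch: treat a binary text-line image as a left-to-right
# sequence of "frames", by analogy with the framing of a speech signal.
# The feature choice is an assumption made for illustration only.

def line_image_to_frames(image, window=2, step=1):
    """Slide a window of `window` columns across the image, `step` columns at a time.

    `image` is a list of rows; each pixel is 1 (ink) or 0 (background).
    Each frame holds two illustrative features:
      - ink density inside the window
      - normalised vertical centre of gravity of the ink (0 = top, 1 = bottom)
    """
    height = len(image)
    width = len(image[0])
    frames = []
    for left in range(0, width - window + 1, step):
        ink = 0
        weighted_rows = 0
        for row_index, row in enumerate(image):
            row_ink = sum(row[left:left + window])
            ink += row_ink
            weighted_rows += row_index * row_ink
        density = ink / (window * height)
        if ink > 0 and height > 1:
            centre = weighted_rows / (ink * (height - 1))
        else:
            centre = 0.5  # no ink (or a single row): neutral placeholder value
        frames.append((density, centre))
    return frames


# A tiny 4x6 toy "line image": a vertical stroke followed by a small blob.
toy_line = [
    [0, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 1, 1],
    [0, 1, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 0],
]

frames = line_image_to_frames(toy_line, window=2, step=1)
for index, (density, centre) in enumerate(frames):
    print(index, round(density, 3), round(centre, 3))
```

The resulting frame sequence is what a sequence model trained with speech-style techniques would consume. Under that assumption, any segmentation into characters or words would come out of recognition as a by-product instead of being a prerequisite for it.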

Examples
      Phrase with large slope and slant variation

      Phrase with inconsistent blank spaces between words

      A typical long sentence