Optical Handwritten Text Recognition
Introduction
The recognition of off-line, continuously handwritten text has proved
to be a challenging pattern recognition task. Although text is
ultimately composed of characters, traditional approaches to
optical character recognition (OCR) generally fail in this task
because of the extreme difficulty of segmenting continuously written
text into characters.
Nevertheless, humans accurately perform both segmentation and
recognition in a seemingly effortless manner. This accuracy is achieved
by ``delaying'' recognition until the highest perception level: only
after having understood a written message are humans able to
``recognize'' the constituent words, the corresponding characters
and the underlying segmentations. Clearly, this striking human
ability comes from a tight cooperation of morphological, lexical,
syntactic and semantic-pragmatic knowledge to accomplish the
task.
Here, the problem of continuous handwritten text (CHT) recognition is
approached using standard continuous speech recognition technology.
The main advantages of this approach are:
a) system
development is based entirely on well-understood training
techniques and
b) no segmentation of sentence or line images
into characters or words is required, either in the training or in
the recognition phase.
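To illustrate why no segmentation is needed, the following minimal Python sketch shows one way a text-line image could be turned into a left-to-right sequence of feature vectors, much as a speech signal is turned into a sequence of acoustic frames. It is only a sketch under stated assumptions: the function name `line_to_frames`, the `cells` parameter and the ink-density feature are hypothetical choices made for illustration and are not taken from the system described here.

```python
from typing import List

Image = List[List[int]]  # rows of 0/1 pixels, 1 = ink (assumed binarised input)


def line_to_frames(image: Image, cells: int = 4) -> List[List[float]]:
    """Turn a text-line image into a left-to-right sequence of feature vectors.

    Hypothetical illustration: each pixel column becomes one frame, and each
    frame holds the fraction of ink pixels in `cells` equal-height bands.
    No character or word boundaries are computed at any point.
    """
    height = len(image)
    if height == 0 or height % cells != 0:
        raise ValueError("image height must be a positive multiple of `cells`")
    width = len(image[0])
    band = height // cells  # number of pixel rows per vertical band
    frames = []
    for x in range(width):  # scan columns left to right, like time in speech
        frame = []
        for c in range(cells):
            ink = sum(image[y][x] for y in range(c * band, (c + 1) * band))
            frame.append(ink / band)
        frames.append(frame)
    return frames


# Toy 4x3 "image": a vertical stroke in column 0, a dot at the bottom of column 2.
toy = [
    [1, 0, 0],
    [1, 0, 0],
    [1, 0, 0],
    [1, 0, 1],
]
frames = line_to_frames(toy, cells=2)
print(frames)
```

A sequence of frames like this could then be passed to a recogniser of the kind used for continuous speech, which would decide on words and their boundaries jointly during decoding rather than in a separate segmentation step.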
Examples
Phrase with large slope and slant variation
Phrase with inconsistent blank spaces between words
A typical long sentence