Optical Handwritten Forms Recognition
Our systems use fast, high-accuracy recognition
algorithms to classify numeric and alphabetic characters
extracted from the fields of handwritten forms. Customized
models allow the system to handle multiple languages.
A linguistic postprocess can be applied to the OCR output
to obtain phrases belonging to a context-constrained
language. A reliability index is provided in both phases
(classification and postprocessing).
About the technical features
Preprocessing: Fields and cells must be isolated and segmented.
A number of processes are necessary: noise removal, blank detection, minimum box inclusion, resampling and parameterization.
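As a rough illustration of some of the steps listed above (blank detection, minimum box inclusion, resampling and a trivial parameterization), the sketch below works on a binary cell image stored as nested lists. The function names, the nearest-neighbour resampling and the raw-pixel feature vector are assumptions made for this example, not the system's actual implementation; noise removal is left out.

```python
# Hypothetical preprocessing sketch: all names and choices here are assumptions.

def bounding_box(img):
    """Return (top, bottom, left, right) of the ink pixels, or None if blank."""
    rows = [r for r, row in enumerate(img) if any(row)]
    if not rows:
        return None  # blank detection: no ink in this cell
    cols = [c for c in range(len(img[0])) if any(row[c] for row in img)]
    return rows[0], rows[-1], cols[0], cols[-1]


def crop(img, box):
    """Minimum box inclusion: keep only the smallest rectangle containing ink."""
    top, bottom, left, right = box
    return [row[left:right + 1] for row in img[top:bottom + 1]]


def resample(img, size):
    """Nearest-neighbour resampling to a size x size grid."""
    h, w = len(img), len(img[0])
    return [[img[r * h // size][c * w // size] for c in range(size)]
            for r in range(size)]


def parameterize(img):
    """Flatten the image into a feature vector (raw pixel values)."""
    return [float(p) for row in img for p in row]


def preprocess(cell, size=3):
    """Return a feature vector for one cell, or None when the cell is blank."""
    box = bounding_box(cell)
    if box is None:
        return None
    return parameterize(resample(crop(cell, box), size))


# A 5x5 cell containing a small "L" shape surrounded by white margin.
CELL = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
]

features = preprocess(CELL, size=3)
print(features)
```

Cropping before resampling makes the feature vector independent of where the writer placed the character inside the cell.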
Classification: The isolated characters are classified.
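The classification step can be pictured with a small k-nearest-neighbour sketch. The first paper listed below describes approximate nearest neighbours search on large databases; this example swaps in an exact brute-force search for simplicity. The toy prototypes and the use of the vote share as a reliability value are assumptions of the example.

```python
from collections import Counter

# Hypothetical k-NN sketch: exact brute-force search stands in for the
# approximate search used on large databases.


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_classify(prototypes, features, k=3):
    """Classify `features` against labelled prototypes.

    prototypes: list of (feature_vector, label) pairs.
    Returns (label, reliability), where reliability is the share of the
    k nearest prototypes that voted for the winning label (an assumption).
    """
    nearest = sorted(prototypes,
                     key=lambda p: squared_distance(p[0], features))[:k]
    votes = Counter(label for _, label in nearest)
    label, count = votes.most_common(1)[0]
    return label, count / len(nearest)


# Toy two-dimensional prototypes; real feature vectors would be much longer.
PROTOTYPES = [
    ([0.0, 0.0], "A"),
    ([0.0, 1.0], "A"),
    ([1.0, 0.0], "A"),
    ([5.0, 5.0], "B"),
    ([5.0, 6.0], "B"),
]

print(knn_classify(PROTOTYPES, [0.2, 0.1]))
print(knn_classify(PROTOTYPES, [4.0, 4.0]))
```

A vote share below 1.0 signals that the neighbourhood is mixed, which is one simple way to flag doubtful characters.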
Parsing: Finally, when possible, a linguistic model adaptation process is applied to the sequence of characters belonging to a field. A corrected string and its reliability are provided.
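To give an intuition for this correction step, the sketch below maps an OCR string to the closest entry of a small lexicon using edit distance. This is a deliberately simplified stand-in, not the stochastic error-correcting parsing described in the second paper listed below. The lexicon, the function names and the score formula are assumptions, and the score is not comparable to the reliability values reported in EXAMPLE 3.

```python
# Hypothetical correction sketch: lexicon lookup by edit distance.


def edit_distance(a, b):
    """Levenshtein distance (insertions, deletions, substitutions cost 1)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def correct(ocr_string, lexicon):
    """Return (closest lexicon entry, score in [0, 1])."""
    best = min(lexicon, key=lambda word: edit_distance(ocr_string, word))
    distance = edit_distance(ocr_string, best)
    # Assumed score: 1.0 for an exact match, lower as more edits are needed.
    score = 1.0 - distance / max(len(ocr_string), len(best))
    return best, score


# Small illustrative lexicon of names.
LEXICON = ["MARIA", "MARTA", "CARMEN", "JOSE", "JUAN"]

for ocr in ["CARMEW", "HARIA", "JVAW"]:
    print(ocr, "->", correct(ocr, LEXICON))
```

A plain lexicon lookup treats every character error as equally likely; a stochastic model can instead weight errors by how often the classifier confuses each pair of characters.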
Extended details can be found in the following papers:
- J.C. Perez-Cortes, J. Arlandis, and R. Llobet. Fast and accurate handwritten character recognition using approximate nearest neighbours search on large databases. In Intl. Workshop on Statistical Pattern Recognition (SPR-2000).
- J.C. Perez-Cortes, R. Llobet, and J. Arlandis. Stochastic Error-Correcting Parsing for OCR Post-Processing. In Intl. Conference on Pattern Recognition (ICPR-2000).
Examples of system functionality
EXAMPLE 1: Preprocessing and classifying the characters of a field
EXAMPLE 2: Recognition of an entire form written in Catalan

EXAMPLE 3: Parser performance on some Spanish first names
| OCR output string | Parser output string | Final reliability |
|---|---|---|
| ANA | ANA | 0.984153 |
| LORENA | LORENA | 0.978755 |
| ESTELA | ESTELA | 0.975940 |
| NOELIA | NOELIA | 0.974733 |
| MARTA | MARTA | 0.973422 |
| MARIOLA | MARIOLA | 0.973381 |
| CELIA | CELIA | 0.972499 |
| MARTIN | MARTIN | 0.967562 |
| ANTONIO | ANTONIO | 0.966201 |
| CRISTINA | CRISTINA | 0.964927 |
| JOSE | JOSE | 0.909258 |
| CDROLINA | CAROLINA | 0.888067 |
| MARGARItd | MARGARITA | 0.875638 |
| CARMEW | CARMEN | 0.873654 |
| VISITACIBN | VISITACION | 0.873195 |
| INMACULAOA | INMACULADA | 0.851757 |
| VERJNICA | VERONICA | 0.850183 |
| FERWANDO | FERNANDO | 0.848719 |
| HARIA | MARIA | 0.841376 |
| MARID | MARIA | 0.837561 |
| MDRIA | MARIA | 0.837561 |
| MDRTA | MARTA | 0.837548 |
| AKPARO | AMPARO | 0.827204 |
| MERCGDES | MERCEDES | 0.814163 |
| LAJRA | LAURA | 0.794335 |
| PDBLO | PABLO | 0.786623 |
| FRD?CISCO | FRANCISCO | 0.770806 |
| M?JOSE | M.JOSE | 0.767717 |
| H?ISABEL | M.ISABEL | 0.742596 |
| TOSE | JOSE | 0.722385 |
| MEENCARND | M.ENCARNA | 0.717666 |
| ESPERDNSD | ESPERANZA | 0.712162 |
| MDZSABEL | M.ISABEL | 0.702045 |
| ROSDN? | ROSANA | 0.700941 |
| HONICB | MONICA | 0.696564 |
| SEATRCZ | BEATRIZ | 0.682953 |
| HARJA | MARIA | 0.671516 |
| CAROCIAD | CAROLINA | 0.649014 |
| ANAMBELEN? | ANABELEN | 0.638773 |
| MCIOSE | M.JOSE | 0.638269 |
| EHHA | EMMA | 0.631853 |
| MORIB | MARIO | 0.592566 |
| JE?OMIYO | JERONIMO | 0.584410 |
| FSOEL | FIDEL | 0.572508 |
| HDRTD | MARTA | 0.569616 |
| JVAW | JUAN | 0.543422 |
| JJTO | SITO | 0.511555 |
| RWGOEL | RAQUEL | 0.508728 |
| D?DBELC | DAYBELI | 0.485588 |
| GGHM? | GEMMA | 0.420531 |
| ?JDDA | GIADA | 0.417031 |
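One way a downstream application might use the final reliability is to accept high-scoring fields automatically and send the rest to manual review. The sketch below uses four rows copied from the table above; the threshold value of 0.6 and the function name are assumptions chosen only for illustration.

```python
# Hypothetical use of the reliability value: the threshold is an assumption.

# (OCR output string, parser output string, final reliability)
ROWS = [
    ("ANA", "ANA", 0.984153),
    ("CARMEW", "CARMEN", 0.873654),
    ("JVAW", "JUAN", 0.543422),
    ("?JDDA", "GIADA", 0.417031),
]


def split_by_reliability(rows, threshold):
    """Split rows into (accepted, needs_review) by final reliability."""
    accepted = [row for row in rows if row[2] >= threshold]
    needs_review = [row for row in rows if row[2] < threshold]
    return accepted, needs_review


accepted, needs_review = split_by_reliability(ROWS, threshold=0.6)
print("accepted:", [parsed for _, parsed, _ in accepted])
print("manual review:", [parsed for _, parsed, _ in needs_review])
```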