Optical Handwritten Forms Recognition

Our systems use fast, high-accuracy recognition algorithms to classify numeric and alphabetic characters extracted from the fields of handwritten forms. Customized models allow the system to handle multiple languages. A linguistic post-processing step can be applied to the OCR output to obtain phrases that belong to a context-constrained language. A reliability index is provided for both phases: character recognition and linguistic post-processing.


 About the technical features

Preprocessing: Fields and cells must first be isolated and segmented. This requires several steps: noise removal, blank detection, minimum bounding box computation, resampling and parameterization.
Classification: Each isolated character is classified.
Parsing: Finally, where possible, the sequence of characters belonging to a field is adapted to a linguistic model. A corrected string and its reliability are provided.
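
The preprocessing steps listed above can be illustrated with a minimal Python sketch. It is not the system's implementation: it assumes a cell is already isolated as a binary image (1 = ink, 0 = background), skips noise removal, and uses plain nearest-neighbour resampling and a flattened pixel vector as a stand-in for parameterization. All function names are hypothetical.

```python
from typing import List, Optional, Tuple

Image = List[List[int]]  # binary image: 1 = ink, 0 = background


def bounding_box(img: Image) -> Optional[Tuple[int, int, int, int]]:
    """Return (top, bottom, left, right) of the ink pixels, or None if blank."""
    rows = [r for r, row in enumerate(img) if any(row)]
    if not rows:
        return None  # blank detection: the cell contains no ink
    cols = [c for c in range(len(img[0])) if any(row[c] for row in img)]
    return rows[0], rows[-1], cols[0], cols[-1]


def crop(img: Image, box: Tuple[int, int, int, int]) -> Image:
    """Keep only the minimum box that contains all ink pixels."""
    top, bottom, left, right = box
    return [row[left:right + 1] for row in img[top:bottom + 1]]


def resample(img: Image, size: int) -> Image:
    """Nearest-neighbour resampling to a size x size grid."""
    h, w = len(img), len(img[0])
    return [[img[r * h // size][c * w // size] for c in range(size)]
            for r in range(size)]


def preprocess_cell(img: Image, size: int = 4) -> Optional[List[int]]:
    """Blank detection, cropping, resampling, then flattening to a vector."""
    box = bounding_box(img)
    if box is None:
        return None
    patch = resample(crop(img, box), size)
    return [pixel for row in patch for pixel in row]
```

A blank cell yields None, so later stages can skip it; any other cell yields a fixed-length vector regardless of how large the character was written.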

Extended details can be found in the following papers:

  • J.C. Perez-Cortes, J. Arlandis, and R. Llobet. Fast and accurate handwritten character recognition using approximate nearest neighbours search on large databases. In Intl. Workshop on Statistical Pattern Recognition (SPR-2000).
  • J.C. Perez-Cortes, R. Llobet, and J. Arlandis. Stochastic Error-Correcting Parsing for OCR Post-Processing. In Intl. Conference on Pattern Recognition (ICPR-2000).
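
The first paper's title refers to nearest-neighbour search over a large prototype database. As a rough illustration of that family of classifiers, here is a brute-force k-nearest-neighbour sketch in Python. It is an assumption-laden stand-in: the search is exact rather than approximate, the feature vectors are arbitrary, and the reliability value (the fraction of the k neighbours that agree with the winning label) is a hypothetical choice, not the index the system reports.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Prototype = Tuple[Sequence[float], str]  # (feature vector, class label)


def knn_classify(sample: Sequence[float],
                 prototypes: List[Prototype],
                 k: int = 3) -> Tuple[str, float]:
    """Return (label, reliability) from an exhaustive k-NN vote."""

    def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
        # Squared Euclidean distance; the square root is not needed for ranking.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Exhaustive search: rank every prototype by distance to the sample.
    nearest = sorted(prototypes, key=lambda p: sq_dist(sample, p[0]))[:k]
    votes = Counter(label for _, label in nearest)
    label, count = votes.most_common(1)[0]
    # Hypothetical reliability: share of neighbours voting for the winner.
    return label, count / len(nearest)
```

Exhaustive search costs one distance computation per prototype, which is why approximate search structures become attractive when the prototype database is large.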

 Examples of system functionality



EXAMPLE 1: Preprocessing and classifying the characters of a field
     





EXAMPLE 2: Recognition of an entire form written in Catalan



EXAMPLE 3: Parser performance on some Spanish given names
OCR output        Parser output         Final reliability
ANA               ANA                   0.984153
LORENA            LORENA                0.978755
ESTELA            ESTELA                0.975940
NOELIA            NOELIA                0.974733
MARTA             MARTA                 0.973422
MARIOLA           MARIOLA               0.973381
CELIA             CELIA                 0.972499
MARTIN            MARTIN                0.967562
ANTONIO           ANTONIO               0.966201
CRISTINA          CRISTINA              0.964927
JOSE              JOSE                  0.909258
CDROLINA          CAROLINA              0.888067
MARGARItd         MARGARITA             0.875638
CARMEW            CARMEN                0.873654
VISITACIBN        VISITACION            0.873195
INMACULAOA        INMACULADA            0.851757
VERJNICA          VERONICA              0.850183
FERWANDO          FERNANDO              0.848719
HARIA             MARIA                 0.841376
MARID             MARIA                 0.837561
MDRIA             MARIA                 0.837561
MDRTA             MARTA                 0.837548
AKPARO            AMPARO                0.827204
MERCGDES          MERCEDES              0.814163
LAJRA             LAURA                 0.794335
PDBLO             PABLO                 0.786623
FRD?CISCO         FRANCISCO             0.770806
M?JOSE            M.JOSE                0.767717
H?ISABEL          M.ISABEL              0.742596
TOSE              JOSE                  0.722385
MEENCARND         M.ENCARNA             0.717666
ESPERDNSD         ESPERANZA             0.712162
MDZSABEL          M.ISABEL              0.702045
ROSDN?            ROSANA                0.700941
HONICB            MONICA                0.696564
SEATRCZ           BEATRIZ               0.682953
HARJA             MARIA                 0.671516
CAROCIAD          CAROLINA              0.649014
ANAMBELEN?        ANABELEN              0.638773
MCIOSE            M.JOSE                0.638269
EHHA              EMMA                  0.631853
MORIB             MARIO                 0.592566
JE?OMIYO          JERONIMO              0.584410
FSOEL             FIDEL                 0.572508
HDRTD             MARTA                 0.569616
JVAW              JUAN                  0.543422
JJTO              SITO                  0.511555
RWGOEL            RAQUEL                0.508728
D?DBELC           DAYBELI               0.485588
GGHM?             GEMMA                 0.420531
?JDDA             GIADA                 0.417031
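
To give a feel for how a raw OCR string can be mapped to a valid name, here is a much simpler stand-in for the parser, sketched in Python. It picks the closest entry of a small lexicon by Levenshtein edit distance. This is not the stochastic error-correcting parser described in the paper cited above: the lexicon, the function names and the score formula are all assumptions made for illustration, and the score is not the reliability shown in the table.

```python
from typing import Iterable, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance with unit costs, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # substitute or match
        prev = cur
    return prev[-1]


def correct(ocr: str, lexicon: Iterable[str]) -> Tuple[str, float]:
    """Return the closest lexicon word and a hypothetical score in [0, 1]."""
    text = ocr.upper()
    best = min(lexicon, key=lambda word: edit_distance(text, word))
    dist = edit_distance(text, best)
    # Hypothetical score: 1.0 for an exact match, lower as more edits are needed.
    return best, 1.0 - dist / max(len(text), len(best))


if __name__ == "__main__":
    names = ["MARIA", "MARTA", "CARMEN", "CAROLINA", "JOSE", "JUAN"]
    for raw in ["CARMEW", "CDROLINA", "JVAW"]:
        print(raw, *correct(raw, names))
```

Unlike this sketch, a stochastic parser can weight each substitution by how likely the classifier is to confuse the two characters and by how frequent each name is, which plain unit-cost edit distance ignores.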