Intelligent character recognition

From Wikipedia, the free encyclopedia

In computer science, intelligent character recognition (ICR) is an advanced form of optical character recognition (OCR) or, more specifically, a handwriting recognition system that allows a computer to learn fonts and different styles of handwriting during processing in order to improve accuracy and recognition levels.

Most ICR software includes a self-learning component, referred to as a neural network, that automatically updates the recognition database with new handwriting patterns. This extends the usefulness of scanning devices for document processing from printed character recognition (a function of OCR) to the recognition of handwritten material. Because handwriting varies so widely, accuracy can be poor in some circumstances, but ICR can achieve accuracy rates of 97% or higher when reading handwriting in structured forms. To reach these rates, the software often runs several recognition engines and gives each one a weighted vote in deciding the true reading of a character. In numeric fields, engines designed to read numbers take precedence, while in alphabetic fields, engines designed to read handwritten letters carry more weight. When used with a bespoke interface hub, handwritten data can be loaded automatically into a back-office system, which avoids laborious manual keying and can be more accurate than traditional human data entry.
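The voting scheme described above can be sketched as a weighted vote. This is a minimal illustration rather than any vendor's implementation: the engine names, the weight values, and the idea of multiplying a weight by a confidence score are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical weights: how much each engine's vote counts per field type.
# Digit-oriented engines dominate numeric fields; letter-oriented engines
# dominate alphabetic fields.
ENGINE_WEIGHTS = {
    "numeric": {"digit_engine": 3.0, "letter_engine": 1.0, "general_engine": 1.5},
    "alpha": {"digit_engine": 1.0, "letter_engine": 3.0, "general_engine": 1.5},
}


def vote(readings, field_type):
    """Return the character with the highest weighted vote total.

    readings maps an engine name to a (character, confidence) pair.
    """
    weights = ENGINE_WEIGHTS[field_type]
    totals = defaultdict(float)
    for engine, (char, confidence) in readings.items():
        totals[char] += weights.get(engine, 1.0) * confidence
    # Iterate candidates in sorted order so that ties resolve deterministically.
    return max(sorted(totals), key=lambda c: totals[c])


# Three engines disagree on whether a mark is the digit 0 or the letter O.
readings = {
    "digit_engine": ("0", 0.80),
    "letter_engine": ("O", 0.85),
    "general_engine": ("O", 0.60),
}
print(vote(readings, "numeric"))
print(vote(readings, "alpha"))
```

With these assumed weights, the same three readings resolve to the digit in a numeric field and to the letter in an alphabetic field, which is the behaviour the paragraph describes.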

An important development in ICR was the invention of Automated Forms Processing in 1993. This is a three-stage process: first, the image of the form is captured and prepared so that the ICR engine can give its best results; second, the ICR engine captures the information; and finally, the results are processed to validate the engine's output automatically.
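The three stages can be sketched as a small pipeline. Everything here is a simplified stand-in chosen for illustration: the image is a list of grayscale rows, preparation is reduced to thresholding, `stub_engine` is a hypothetical placeholder for a real ICR engine, and validation is a regular-expression check per field.

```python
import re
from dataclasses import dataclass


@dataclass
class FieldResult:
    name: str
    text: str
    valid: bool


def prepare_image(pixels, threshold=128):
    """Stage 1: binarize grayscale rows (0-255) so that 1 = ink, 0 = paper."""
    return [[1 if p < threshold else 0 for p in row] for row in pixels]


def recognize(binary_image, fields, engine):
    """Stage 2: crop each field region and hand it to the recognition engine."""
    raw = {}
    for name, (top, left, bottom, right) in fields.items():
        crop = [row[left:right] for row in binary_image[top:bottom]]
        raw[name] = engine(crop)
    return raw


def validate(raw, rules):
    """Stage 3: check each recognized value against its field's pattern."""
    return [
        FieldResult(name, text, bool(re.fullmatch(rules[name], text)))
        for name, text in raw.items()
    ]


def stub_engine(crop):
    """Hypothetical engine: returns the ink-pixel count as text."""
    return str(sum(sum(row) for row in crop))


def process_form(pixels, fields, rules, engine=stub_engine):
    binary = prepare_image(pixels)
    return validate(recognize(binary, fields, engine), rules)


pixels = [[10, 200, 20, 30], [250, 240, 5, 255]]
fields = {"quantity": (0, 0, 1, 4), "code": (1, 0, 2, 4)}  # top, left, bottom, right
rules = {"quantity": r"\d+", "code": r"[A-Z]{2}\d{2}"}
for result in process_form(pixels, fields, rules):
    print(result)
```

Keeping validation as a separate final stage means a field that fails its rule can be flagged for manual review instead of being passed on unchecked.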

This application of ICR increased the usefulness of the technology and made it applicable to real-world forms in normal business applications. Modern software applications use ICR to recognize text in forms filled in by hand (hand-printed):

Company: ABBYY
Products: ABBYY FlexiCapture; ABBYY FlexiCapture Engine; ABBYY FineReader Engine
ICR languages supported: Afrikaans, Albanian, Aymara, Azerbaijani (Latin), Basque, Bemba, Blackfoot, Breton, Bugotu, Bulgarian, Cebuano, Chamorro, Corsican, Crimean Tatar, Croatian, Crow, Czech, Dakota (Sioux), Dutch (Belgium), Dutch (Netherlands), English, Estonian, Even, Evenki, Fijian, Finnish, French, Frisian, Friulian, Galician, Ganda, German, German (Luxembourg), German (new spelling), Greek, Guarani, Hani, Hausa, Hawaiian, Hungarian, Icelandic, Indonesian, Irish, Italian, Jingpo, Karachay-Balkar, Kasub, Kawa, Kazakh, Kirghiz, Kongo, Kpelle, Kumyk, Kurdish, Latin, Latvian, Lithuanian, Luba, Malagasy, Malinke, Maori, Maya, Miao, Minangkabau, Mohawk, Moldavian, Mongol, Mordvin, Nahuatl, Nivkh, Nogay, Nyanja, Ojibway, OldFrench, OldGerman, OldItalian, OldSpanish, Papiamento, Polish, Quechua, Rhaeto-Romanic, Romanian, Romany, Rundi, Russian, Rwanda, Sami (Lappish), Samoan, Scottish Gaelic, Selkup, Serbian (Latin), Slovak, Slovenian, Somali, Sotho, Spanish, Swahili, Swazi, Tagalog, Tahitian, Tok Pisin, Tongan, Tswana, Tun, Turkish, Uigur (Latin), Ukrainian, Wolof, Xhosa, Zapotec, Ido, Interlingua

Company: Accusoft
Products: SmartZone ICR/OCR
ICR languages supported: English, Danish, Dutch, Finnish, French, German, Italian, Norwegian, Portuguese, Spanish, and Swedish (.NET supports all listed, ActiveX is English only)

Company: ExperVision
Products: TypeReader; OpenRTK
ICR languages supported: English, French, German, Italian, Spanish, Portuguese, Danish, Dutch, Swedish, Norwegian, Hungarian, Polish, Simplified Chinese, Traditional Chinese, Russian, Finnish and Polynesian

Company: I.R.I.S. Group
Products: IRISCapture Pro for Forms
ICR languages supported: Latin based languages

Company: LEADTOOLS
Products: LEADTOOLS ICR SDK Module
ICR languages supported: Catalan, Czech, Danish, Dutch, English, Finnish, French, German, Hungarian, Italian, Norwegian, Polish, Portuguese, Spanish, Swedish

Company: Prime Vision
Products: READ-IT READ-IT ICR
ICR languages supported: Latin based languages, Hebrew, Cyrillic, Asian language support

Company: reRecognition
Products: Kadmos
ICR languages supported: (not listed)

Company: Recogniform Technologies
Products: Recogniform Desktop Reader
ICR languages supported: (not listed)

Company: CharacTell
Products: SoftWriting
ICR languages supported: (not listed)

Taking ICR to the Next Level

Intelligent word recognition (IWR) can recognize and extract not only hand-printed information but cursive handwriting as well. ICR recognizes text at the character level, whereas IWR works with full words or phrases. Capable of capturing unstructured information from everyday pages, IWR is said to be more evolved than hand-print ICR, according to the Committee for Capturing Abstractions (CCA).[citation needed]

Not meant to replace conventional ICR and OCR systems, IWR is optimized for processing real-world documents that contain mostly free-form, hard-to-recognize data fields that are inherently unsuitable for ICR. IWR is therefore best used to eliminate a high percentage of the manual entry of handwritten data and run-on hand-print fields on documents that could otherwise be keyed only by humans.
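The difference between reading characters in isolation and reading whole words can be illustrated with a toy example. The sketch below assumes one particular way of using word-level context, namely matching a noisy reading against a lexicon with Python's standard `difflib` module. The lexicon and the noisy input are invented for the example, and actual IWR systems may work quite differently.

```python
import difflib

# Hypothetical lexicon of words expected in a "city" field.
LEXICON = ["london", "lisbon", "leeds"]


def read_by_character(char_readings):
    """Character-level reading: each character is accepted in isolation."""
    return "".join(char_readings)


def read_by_word(char_readings, lexicon=LEXICON, cutoff=0.6):
    """Word-level reading: pick the lexicon word closest to the raw string.

    Falls back to the raw character string when nothing is similar enough.
    """
    raw = read_by_character(char_readings)
    matches = difflib.get_close_matches(raw, lexicon, n=1, cutoff=cutoff)
    return matches[0] if matches else raw


# The letter "o" has been misread as the digit "0" in two positions.
noisy = ["l", "0", "n", "d", "0", "n"]
print(read_by_character(noisy))
print(read_by_word(noisy))
```

Under this assumption, the word-level reader can recover from character errors that the character-level reader passes straight through, at the cost of needing a lexicon suited to the field.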

See also