Optical Character Recognition (OCR) refers to the software technology and processes that translate printed text into computer-searchable text.

Done correctly, OCR enables users to search for and retrieve individual words contained within a file or page. In addition, when a set of files is indexed, users can search for keywords across an entire document library and retrieve each matching page precisely. Searches that once took several hours or days to complete can now be executed in seconds.

However, this technology did not work well on older or poor-quality documents that contained mixed fonts or combinations of text and graphics. Until now. Due to several recent technology advances, it is now possible to obtain six-sigma-level character accuracy (conventionally, about 3.4 errors per million characters) from these types of document collections.

The quality and condition of the paper documents are still key factors in a successful OCR conversion, but dramatically improved results can be obtained by enhancing the quality of the scanned image prior to processing. Removal of noise such as borders, speckles, and skew is now common on the more advanced document scanners. Furthermore, advanced color filter technologies may be used to reduce page background colors, in conjunction with multi-light image capture technologies that remove shadows cast by page creases, which could otherwise impact image quality or recognition accuracy.

Once document scanning and processing are complete, an OCR text layer can be added and hidden behind each image. An additional orientation filter can be used to ensure that the best image is presented to the OCR engines.

To achieve the highest conversion accuracy possible, the characters in the image can be processed using multi-engine OCR voting technologies that rank each character to determine the best text recognition fit. Once a word is generated, it is filtered through a proprietary lexicon to ensure the highest-quality results.

Finally, this text can be processed using sophisticated layout retention technologies that reproduce the layout of the text in the image, providing the best possible text representation for precise search and retrieval.
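
The multi-engine voting and lexicon filtering steps described above can be illustrated with a minimal sketch. It is not the method of any particular product: the function names `vote_characters` and `filter_with_lexicon`, the similarity cutoff, and the tiny lexicon are all hypothetical, and the sketch assumes the engines' outputs are already aligned character by character, which real systems have to arrange for themselves.

```python
from collections import Counter
from difflib import get_close_matches


def vote_characters(readings):
    """Pick the most frequent character at each position across engines.

    Assumption: every engine returned a string of the same length, so
    position i refers to the same glyph in every reading.
    """
    if len({len(r) for r in readings}) != 1:
        raise ValueError("this sketch assumes equal-length, aligned readings")
    voted = []
    for column in zip(*readings):
        # On a tie, Counter keeps first-seen order, so the earlier engine wins.
        voted.append(Counter(column).most_common(1)[0][0])
    return "".join(voted)


def filter_with_lexicon(word, lexicon, cutoff=0.8):
    """Return the word if known, else its closest lexicon entry, else unchanged."""
    if word.lower() in lexicon:
        return word
    # sorted() makes the candidate order deterministic for set inputs.
    matches = get_close_matches(word.lower(), sorted(lexicon), n=1, cutoff=cutoff)
    return matches[0] if matches else word


if __name__ == "__main__":
    # Three hypothetical engine outputs for the same word image.
    readings = ["rec0gnition", "recognition", "recognitlon"]
    print(vote_characters(readings))  # recognition

    lexicon = {"recognition", "retrieval", "search"}  # stand-in lexicon
    print(filter_with_lexicon("retrleval", lexicon))  # retrieval
```

Voting corrects errors that only a minority of engines make, while the lexicon pass catches errors that survive the vote. A cutoff that is too low risks replacing a correct but unlisted word, such as a proper name, with a lexicon entry, so unknown words are left unchanged when nothing is sufficiently close.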