What is OCR?
The acronym OCR stands for Optical Character Recognition. The corresponding German terms are optische Zeichenerkennung and Texterkennung. OCR is mainly used for fast, automated reading of printed text from scanned documents. It should not be confused with ICR (the software-based reading of handwriting) or OMR (the electronic capture of questionnaires and forms).
OCR: Optical Character Recognition (recognition of printed text)
OMR: Optical Mark Recognition (recognition of form fields)
ICR: Intelligent Character Recognition (recognition of handwriting)
OCR without Correction
OCR software can capture a large amount of text in a relatively short time. With good original documents that have a normal font size and a simple layout (few images, no tables, no multi-column text, no footnotes or running heads), we achieve recognition rates of 95% and more. We can look back on years of experience in the field of OCR and have followed the development and refinement of this technology. We analyze your documents in advance, produce scans optimized for reading (removal of background and spots, text enhancement, contrast and brightness adjustment, page orientation detection, removal of black edges) and configure our OCR software to your needs, so that special characters and symbols are correctly identified and displayed. We have the resources to process large quantities of document scans in a short time. In principle, software-based reading is now possible even for handwriting. In our experience, software-based data capture, for example of handwritten digits, makes a lot of sense, especially for large numbers of manually completed forms and questionnaires.
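The text does not say how a rate such as 95% is measured. One common way, assumed here purely for illustration, is character accuracy: compare the OCR output with a manually verified reference text and count the character edits (insertions, deletions, substitutions) needed to turn one into the other. The function names below are hypothetical and not part of any particular OCR product.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions and substitutions to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_accuracy(reference: str, ocr_output: str) -> float:
    """Share of reference characters recognized correctly (0.0 to 1.0).
    Assumption: accuracy = 1 - edit errors / reference length."""
    if not reference:
        return 1.0 if not ocr_output else 0.0
    errors = edit_distance(reference, ocr_output)
    return max(0.0, 1.0 - errors / len(reference))


# Hypothetical example: one misread character in a 20-character line.
reference = "01234567890123456789"
ocr_output = "0123456789012345678X"
print(f"{character_accuracy(reference, ocr_output):.2%}")
```

Measured this way, a 95% rate still leaves roughly one wrong character in every twenty, which is why a manual check can be worthwhile for demanding texts.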
OCR with Correction Service
If an accuracy of up to 95% is not sufficient, or if difficult documents have to be processed, we offer an OCR correction service. All letters and words that the OCR software classifies as incorrectly recognized are manually checked by one of our employees. If desired, the entire text is then spell-checked.
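How the software decides which words to hand over for manual checking is not described here. A minimal sketch, assuming the OCR engine reports a confidence value between 0 and 1 for each word, could split the output at a threshold. The class `RecognizedWord`, the function `split_for_review` and the threshold of 0.90 are hypothetical choices for illustration only.

```python
from dataclasses import dataclass


@dataclass
class RecognizedWord:
    text: str
    confidence: float  # assumed range 0.0 to 1.0, reported by the OCR engine


def split_for_review(words, threshold=0.90):
    """Separate words into those accepted automatically and those
    sent to a human corrector because confidence is below threshold."""
    accepted, needs_review = [], []
    for word in words:
        if word.confidence >= threshold:
            accepted.append(word)
        else:
            needs_review.append(word)
    return accepted, needs_review


# Hypothetical OCR output for three words.
words = [
    RecognizedWord("Optical", 0.99),
    RecognizedWord("Charakter", 0.62),
    RecognizedWord("Recognition", 0.95),
]
accepted, needs_review = split_for_review(words)
print([w.text for w in needs_review])
```

A lower threshold sends fewer words to the corrector but lets more errors through; the right value depends on the documents and the accuracy required.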
Even old typefaces and scripts such as Fraktur and Sütterlin can be recognized using special OCR software. In this case, however, manual post-correction is usually essential. The poor quality of the often faded writing and the sometimes heavy soiling of the aged paper complicate a fully automated process, which generally delivers unsatisfactory results.