File Digitization in Various Formats and Structures

File Formats for Digital Documents

We scan books and documents into any desired format and are happy to accommodate individual requirements. Even the creation of specific index files for import into your document management system is a routine task for DOCUBYTE. For file scans, we can on request supply a comprehensive Excel checklist with all the important information at handover. If you are not familiar with the different formats, we will gladly advise you!

Output formats:

PDF

TIFF

JPEG

RAW

Compression methods:

CCITT Group 4 (Fax 4)

Mixed Raster Content

LZW

JPEG2000

Compression Methods for Scanned Documents

Color scans in particular can produce large amounts of data. To deliver digital documents with reasonable file sizes, we use different compression methods.

JPEG and JPEG2000 are the best-known and most widely used methods for reducing the file size of color scans. When scanning files we mainly use lossy compression, since the resulting differences are barely noticeable in normal document scans while the amount of data is reduced significantly.

With so-called Mixed Raster Content (MRC) compression, color-scanned documents can be reduced to the size of black-and-white scans. This procedure is only possible for PDF, because it uses several file-size reduction methods in combination. The advantage: color information is not lost, and file sizes are reduced to a minimum.

Black-and-white scans can be reduced with CCITT Group 4 compression to as little as one twentieth of the original file size, without loss of quality. The resulting images can be embedded not only in PDF but also in the versatile TIFF format.
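To give a feel for the numbers, the following sketch estimates the size of a single scanned page. The page format (A4) and the resolution (300 dpi) are assumptions chosen for illustration, and the factor of 20 is the best case quoted above; real results depend on the content of the page.

```python
# Rough size estimate for one scanned page.
# Assumptions (not from a specific project): A4 page, 300 dpi.
A4_INCHES = (8.27, 11.69)  # width and height of an A4 page in inches
DPI = 300                  # assumed scan resolution


def raw_size_bytes(bits_per_pixel: int, dpi: int = DPI) -> int:
    """Uncompressed size of one A4 page at the given bit depth."""
    width_px = round(A4_INCHES[0] * dpi)
    height_px = round(A4_INCHES[1] * dpi)
    return width_px * height_px * bits_per_pixel // 8


bw_raw = raw_size_bytes(1)      # black and white: 1 bit per pixel
color_raw = raw_size_bytes(24)  # RGB color: 24 bits per pixel
bw_g4 = bw_raw // 20            # best case: one twentieth of the original

print(f"color, uncompressed: {color_raw / 1e6:.1f} MB")
print(f"b/w, uncompressed:   {bw_raw / 1e3:.0f} kB")
print(f"b/w, CCITT Group 4:  {bw_g4 / 1e3:.0f} kB (best case)")
```

The comparison also shows why color scans need stronger compression: at the same resolution an uncompressed 24-bit color page is 24 times larger than the black-and-white page.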

Scan Service – Output Structures

The output structures described below are the most common standards that have emerged from many scanning projects. This type of output is intended in particular for archiving scanned documents in the file system without a dedicated document management system (DMS). For scanning projects with an existing DMS, we of course program suitable interfaces for the document scans and the corresponding index values. In principle, we are able to deliver any desired data structure.

(1) File = PDF

(2) File = Directory | 1 Document = 1 PDF

(3) File = Directory | File Registers = Directories | 1 Document = 1 PDF

(1) File = PDF

A searchable multi-page PDF file is created for each folder / file. The file is named according to the label on the folder spine or according to customer specifications (this requires the customer to label the files with barcodes beforehand and to compile the corresponding file names).

Optionally, bookmarks can be inserted into the PDF for each register divider. The bookmarks are either numbered consecutively (automated variant) or named after the respective register label (manual data capture required).
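The two bookmark options can be sketched as follows. This is only an illustration, not our production tooling: the function name `bookmark_titles` and the title pattern "Register 01" are hypothetical.

```python
from typing import Optional, Sequence


def bookmark_titles(register_count: int,
                    labels: Optional[Sequence[str]] = None) -> list[str]:
    """Return one bookmark title per register divider.

    Without labels, titles are numbered consecutively (automated variant).
    With labels, the manually captured register labels are used instead.
    """
    if labels is None:
        # Hypothetical title pattern; any consecutive numbering would do.
        return [f"Register {n:02d}" for n in range(1, register_count + 1)]
    if len(labels) != register_count:
        raise ValueError("one label per register is required")
    return list(labels)


print(bookmark_titles(3))
print(bookmark_titles(2, ["Invoices", "Contracts"]))
```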

Figure: Scan service data output, variant 1 (simple structure for file scans)

Figure: Scan service data output, variant 2 (output structure with document or register separation)

(2) File = Directory | 1 Document = 1 PDF

A directory (folder in the file system) is created for each file. The file directories are named after the label on the spine of the respective folder. Each directory contains the individual registers or individual documents in the same order in which they were filed in the paper folder. The individual registers / documents are given a consecutive number and output as searchable PDF files.

Optionally, our data entry staff capture a meaningful name for each document from the register label or from the document content, which is then used to name the individual PDF files. It is of course also possible to capture defined index values for each document, e.g. file number, part number, employee number, etc.
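A minimal sketch of this structure, including a simple index file, might look like this. The four-digit numbering, the semicolon-separated CSV layout, and the sample values ("Personnel_2019", "Contract", "4711") are assumptions for illustration; the actual naming and index format are agreed per project.

```python
import csv
import io
from pathlib import PurePosixPath
from typing import Optional


def document_path(file_label: str, seq: int,
                  name: Optional[str] = None) -> PurePosixPath:
    """Path of one document PDF inside its file directory."""
    # Consecutive number first, so the paper filing order is preserved.
    stem = f"{seq:04d}" if name is None else f"{seq:04d}_{name}"
    return PurePosixPath(file_label) / f"{stem}.pdf"


def index_csv(rows: list[dict]) -> str:
    """Serialize per-document index values as semicolon-separated CSV."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]),
                            delimiter=";", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = [
    {"path": str(document_path("Personnel_2019", 1, "Contract")),
     "employee_number": "4711"},
    {"path": str(document_path("Personnel_2019", 2)),
     "employee_number": "4711"},
]
print(index_csv(rows))
```

Putting the consecutive number in front of the optional name keeps the documents sorted in their original filing order in any file browser.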

(3) File = Directory | File Registers = Directories | 1 Document = 1 PDF

This third variant requires the most effort for document separation and data capture and is therefore used only rarely.

A directory is created for each file. The file directories are named after the label on the spine of the respective folder. In contrast to variant 2, the register structure is also mapped to directories in this case. Each register directory in turn contains the associated documents. As in variant 2, these individual documents are given a consecutive number and output as searchable PDF files.

Optionally, our data entry staff capture both the register labels and meaningful file names for the respective documents.
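The nested structure can be sketched in the same way. The function name `variant3_paths` and the sample labels are hypothetical, and the sketch assumes that the consecutive numbering restarts in each register directory, which may differ in a given project.

```python
from pathlib import PurePosixPath


def variant3_paths(file_label: str,
                   registers: dict[str, list[str]]) -> list[str]:
    """Build file / register / document paths for one paper file.

    `registers` maps each register label to its document names,
    both in the order in which they were filed.
    """
    paths = []
    for register_label, documents in registers.items():
        # Assumption: numbering restarts at 1 inside every register.
        for seq, name in enumerate(documents, start=1):
            path = PurePosixPath(file_label) / register_label / f"{seq:04d}_{name}.pdf"
            paths.append(str(path))
    return paths


for path in variant3_paths(
    "Personnel_2019",
    {"Contracts": ["Employment", "Amendment"], "Payroll": ["January"]},
):
    print(path)
```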

Figure: Scan service data output, variant 3 (complex structure with subfolders and document separation)