Introduction
PDF417 is a multi-row, variable-length symbology with high data capacity
and error-correction capability. Several unique features have made it
one of the most widely used 2D symbologies. A PDF417 symbol can be read
by linear scanners, laser scanners or two-dimensional scanners. A single
symbol can encode more than 1100 bytes, 1800 text characters or 2710
digits. Larger data files can be encoded into a series of linked PDF417
symbols using a standard methodology referred to as Macro PDF417.
Major features of the PDF417 symbology:
- Character Set – all 128 ASCII characters, all 128 extended ASCII
  characters, 8-bit binary data
- Symbol Size – 3 to 90 rows, 90X to 583X in width
- Bidirectional Decoding – Yes
- Error Correction Level – 0 (the lowest level) to 8 (the maximum
  error correction level)
- Additional Options – Macro PDF417, Truncated PDF417, Global Label
  Identifier (GLI)
Symbol Structure
A typical PDF417 symbol contains 3 to 90 rows. Each row consists of
(from left to right):
- Leading quiet zone
- Start Pattern
- Left row indicator symbol character
- 1 to 30 data symbol characters
- Right row indicator symbol character
- Stop pattern
- Trailing quiet zone

Each symbol character is 17 modules wide and always consists of 4 bars
and 4 spaces. Each symbol character represents a value from 0 to 928;
these values are called “codewords” in the specification.
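As a rough illustration of the row layout above, the sketch below computes the overall width of a symbol in modules. It assumes that the start pattern and both row indicators are 17 modules wide like any other symbol character, that the stop pattern is 18 modules wide, and that each quiet zone is the minimum of 2 modules. The function name and parameters are illustrative only.

```python
SYMBOL_CHAR_MODULES = 17   # symbol characters, row indicators and start pattern
STOP_PATTERN_MODULES = 18  # assumption: the stop pattern is one module wider
MIN_QUIET_ZONE = 2         # assumption: minimum quiet zone of 2X on each side


def row_width_modules(data_columns, quiet_zone=MIN_QUIET_ZONE):
    """Return the total width of a PDF417 row in modules (X units)."""
    if not 1 <= data_columns <= 30:
        raise ValueError("a row holds 1 to 30 data symbol characters")
    # start pattern + left row indicator + data columns + right row indicator
    seventeen_wide = 1 + 1 + data_columns + 1
    return (2 * quiet_zone
            + seventeen_wide * SYMBOL_CHAR_MODULES
            + STOP_PATTERN_MODULES)


narrowest = row_width_modules(1)
widest = row_width_modules(30)
```

Under these assumptions, 1 and 30 data columns give 90 and 583 modules, which agrees with the width range listed among the major features.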
You can adjust the following parameters of a PDF417 symbol:
- Number of Rows
- Width of the unit (X dimension)
- Height of the unit (Y dimension)
- Number of Columns (or Aspect Ratio of the symbol)
Although you can adjust the number of rows and columns, the number of
symbol characters remains constant across all rows of a given symbol;
that is, a PDF417 symbol is always rectangular.
Symbol Character Encodation
Each PDF417 symbol character consists of 4 bars and 4 spaces that
total 17 modules in width. Each bar and space can be from 1 to 6
modules wide. In theory there are 10,480 such patterns, which are
divided into nine clusters (character sets). PDF417 uses only clusters
0, 3 and 6, taking 929 patterns from each (one per codeword value).
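The number of possible patterns can be checked by brute force. The sketch below enumerates every way to split 17 modules into 4 bars and 4 spaces of 1 to 6 modules each, then groups the results by cluster. The cluster formula used here, (b1 - b2 + b3 - b4 + 9) mod 9 over the four bar widths, is the one commonly quoted for PDF417 and should be treated as an assumption, as should the function names.

```python
from collections import Counter
from itertools import product


def enumerate_patterns(modules=17, elements=8, max_width=6):
    """Yield (bar, space, bar, space, ...) width tuples that fill the character."""
    for widths in product(range(1, max_width + 1), repeat=elements - 1):
        last = modules - sum(widths)  # the final space takes what is left
        if 1 <= last <= max_width:
            yield widths + (last,)


def cluster_of(widths):
    """Cluster number computed from the four bar widths (assumed formula)."""
    b1, b2, b3, b4 = widths[0::2]  # bars sit at even positions
    return (b1 - b2 + b3 - b4 + 9) % 9


patterns = list(enumerate_patterns())
per_cluster = Counter(cluster_of(p) for p in patterns)
```

Enumerating this way gives 10,480 candidate bar/space patterns; `per_cluster` shows how they are spread over the nine clusters, of which PDF417 uses three.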
Row Encoding
Each row uses character patterns from a single cluster. Adjacent rows
use different clusters in the repeating sequence 0, 3, 6, 0, 3, 6, ...:

Cluster number = ((row number - 1) mod 3) * 3

Each row starts with a left row indicator and ends with a right row
indicator. These row indicators are symbol characters that encode the
row number, the total number of rows, the number of columns and the
error correction level.
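The cluster rule is simple enough to express directly. A minimal sketch, assuming rows are numbered from 1 as in the formula above (the function name is illustrative):

```python
def cluster_for_row(row_number):
    """Return the cluster (0, 3 or 6) used by a 1-based row number."""
    if row_number < 1:
        raise ValueError("row numbers start at 1")
    return ((row_number - 1) % 3) * 3


first_six = [cluster_for_row(r) for r in range(1, 7)]  # [0, 3, 6, 0, 3, 6]
```

Because neighbouring rows never share a cluster, a scanner whose scan line drifts across a row boundary can tell which row each decoded character belongs to.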
Compaction Mode
The data is encoded using one of three compaction modes: Text
Compaction mode, which encodes alphanumeric characters and punctuation;
Binary Compaction mode, which encodes all 8-bit characters; and Numeric
Compaction mode, which achieves the highest density by allowing only
digits. The default mode is Text Compaction mode. Special codewords
switch from one compaction mode to another.
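To illustrate why Numeric Compaction is the densest mode, here is a minimal sketch of its core step as commonly described for PDF417: digits are taken in groups of up to 44, a leading 1 is prepended to each group so that leading zeros survive, and the resulting integer is written in base 900. Mode latch codewords are omitted, and the function name is illustrative.

```python
def numeric_compaction(digits):
    """Convert a digit string to base-900 codewords, 44 digits at a time."""
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("numeric compaction accepts ASCII digits only")
    codewords = []
    for start in range(0, len(digits), 44):
        # The leading '1' preserves any leading zeros in the group.
        value = int("1" + digits[start:start + 44])
        group = []
        while value:
            value, remainder = divmod(value, 900)
            group.append(remainder)
        codewords.extend(reversed(group))  # most significant codeword first
    return codewords


sample = numeric_compaction("000213298174000")
```

A full group of 44 digits produces 15 codewords, or roughly 2.9 digits per codeword.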
Symbol characters with values from 900 to 928 are reserved for control
purposes. These control characters include mode latch and mode shift
codewords. A mode latch switches to a new mode that stays in effect
until another mode switch is performed; a mode shift switches
temporarily from Text Compaction mode to Binary Compaction mode.
| Value | Usage |
|-------|-------|
| 0-899 | Symbol characters used to encode actual data; the meaning depends on the compaction mode |
| 900 | Mode Latch to Text Compaction |
| 901 | Mode Latch to Binary Compaction |
| 902 | Mode Latch to Numeric Compaction |
| 913 | Mode Shift to Binary Compaction |
| 924 | Mode Latch to Binary Compaction (used when the number of bytes is a multiple of 6) |
| 925, 926, 927 | Used for GLI interpretation |
| 922, 923, 928 | Used for Macro PDF417 control blocks |
| 921 | Reader Initialization (Macro PDF417) |
| 903-912, 914-920 | Reserved for future use |
Global Label Identifier (GLI)
Currently most PDF417 symbols are based on the default GLI 0, which
corresponds to the ISO 8859-1 character set. It is possible to encode
data in other languages, such as Japanese and Chinese; this usage,
however, is not widely supported. For more information about GLI
allocation, refer to the AIM-USA document Global Label Identifier (GLI)
Assignments. In this article we assume that the GLI value is 0 unless
otherwise noted.
Error Correction Capacity
Each PDF417 symbol contains 2 to 512 error correction codewords,
corresponding to error correction levels 0 (the lowest) to 8 (the
highest). The number of error correction codewords for each level is as
follows:
| Error Correction Level | Number of Error Correction Codewords |
|------------------------|--------------------------------------|
| 0 | 2 |
| 1 | 4 |
| 2 | 8 |
| 3 | 16 |
| 4 | 32 |
| 5 | 64 |
| 6 | 128 |
| 7 | 256 |
| 8 | 512 |
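The counts in the table double with each level, so they can be derived rather than looked up. A minimal sketch (the function name is illustrative):

```python
def ec_codeword_count(level):
    """Number of error correction codewords for levels 0 through 8."""
    if not 0 <= level <= 8:
        raise ValueError("error correction level must be 0 to 8")
    return 2 ** (level + 1)


counts = [ec_codeword_count(level) for level in range(9)]
```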
In Morovia PDF417 software products, a value of 9 selects automatic
error correction, in which the program picks the error correction level
based on the data encoded.

The error correction codewords themselves are calculated using
Reed-Solomon techniques. The calculation is non-trivial; the underlying
theory can be found in many coding theory texts.
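The sketch below shows one way to compute the error correction codewords. It assumes the construction commonly described for PDF417: arithmetic modulo the prime 929, a generator polynomial whose roots are the powers 3^1 to 3^k, and error correction codewords equal to the negated remainder of the shifted data polynomial. Function names are illustrative, and the input is assumed to be the complete data codeword sequence, including the leading symbol length descriptor.

```python
MOD = 929  # PDF417 codewords are treated as elements of GF(929)


def generator_poly(k):
    """Coefficients of (x - 3)(x - 3^2)...(x - 3^k) mod 929, lowest degree first."""
    g = [1]
    for i in range(1, k + 1):
        root = pow(3, i, MOD)
        nxt = [0] * (len(g) + 1)
        for j, coeff in enumerate(g):
            nxt[j + 1] = (nxt[j + 1] + coeff) % MOD   # coeff * x
            nxt[j] = (nxt[j] - coeff * root) % MOD    # coeff * (-root)
        g = nxt
    return g


def ec_codewords(data, level):
    """Return 2**(level + 1) error correction codewords for the data codewords."""
    k = 2 ** (level + 1)
    g = generator_poly(k)
    rem = [0] * k  # rem[j] holds the coefficient of x**j of the running remainder
    for d in data:
        t = (d + rem[k - 1]) % MOD
        for j in range(k - 1, 0, -1):
            rem[j] = (rem[j - 1] - t * g[j]) % MOD
        rem[0] = (-t * g[0]) % MOD
    # The codewords are the negated remainder, highest degree first.
    return [(-c) % MOD for c in reversed(rem)]


def evaluate(codewords, x):
    """Evaluate the codeword polynomial (first codeword = highest degree) at x."""
    acc = 0
    for c in codewords:
        acc = (acc * x + c) % MOD
    return acc
```

A useful sanity check for any implementation along these lines: the data codewords followed by the error correction codewords, read as one polynomial, should evaluate to zero at every root 3^i of the generator polynomial.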
Printing Considerations
Today, most printers are pixel-based, meaning that the minimum printing
unit is a pixel (dot). The number of pixels per module should be chosen
to come as close as possible to the nominal width, and every bar and
space must be scaled to an exact multiple of the pixel pitch of the
printer being used.
A high-resolution printer (200 dpi or more) usually produces symbols
that most scanners can read. The pixel is so small that the scanner
will not misinterpret the width difference even when two bars that
should in theory have the same width are rasterized to different numbers
of pixels. However, if you first produce the barcode image at screen
resolution and then enlarge it for printing, the actual rasterization
happens at the screen's relatively low resolution (72 dpi). In this
case, choosing a correct X-dimension is important: the X-dimension
should be an integer multiple of the pixel width. For example, on a
computer screen with a resolution of 72 dpi, the width of a pixel is
1000/72 = 13.89 mils. Since the pixel size is relatively big, you need
to choose an integer multiple of the pixel width as the X-dimension and
Y height (for example, 13.89 mils, 27.78 mils and so on).
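The arithmetic above can be sketched as a small helper that snaps a requested X-dimension to a whole number of device pixels. This is only an illustration; the function names and the choice of rounding to the nearest pixel count are assumptions, not part of any particular product.

```python
def pixel_width_mils(dpi):
    """Width of one device pixel in mils (1 mil = 1/1000 inch)."""
    if dpi <= 0:
        raise ValueError("dpi must be positive")
    return 1000.0 / dpi


def snap_x_dimension(target_mils, dpi):
    """Return (pixels per module, resulting X-dimension in mils)."""
    pixel = pixel_width_mils(dpi)
    pixels = max(1, round(target_mils / pixel))  # never less than one pixel
    return pixels, pixels * pixel


screen = snap_x_dimension(25.0, 72)    # 2 pixels, about 27.78 mils
printer = snap_x_dimension(10.0, 300)  # 3 pixels, about 10 mils
```

At 72 dpi the achievable X-dimensions are far apart, so the snapped value can differ noticeably from the requested one; at 300 dpi the steps are about 3.3 mils and the difference is small.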