Barcodes
webPDF supports the recognition and creation of barcodes ("Barcode" web service) with various common barcode formats.
API {REST}: /barcode
The mode for the "Barcode" service is set with the "add" (generation) and "detect" (recognition) parameters. Moreover, the various parameters can be used to configure the way in which the generation or recognition operation will work.
Recognition mode
In recognition mode, the web service will search for all selected barcode formats in the selected area in the PDF document. Depending on the selected option, the web service will structure the corresponding results in the form of a JSON or XML document.
Although most barcode formats do not strictly require it, limiting the number of pages scanned and the size of the area scanned on these pages can significantly reduce the analysis time required for the recognition operation.
If there are multiple barcodes of the same format on the same page, it is strongly advisable to limit the size of the area being scanned, as the detection operation may otherwise fail. In this case, it is recommended to select a separate area for each barcode and recognize these barcodes separately.
The list of results provided in recognition mode will never contain duplicates for a scanned page. If a barcode featuring the exact same content and format is found multiple times on a page, it will only be listed once.
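The de-duplication rule can be pictured as keeping one entry per combination of page, format, and content. The following is only an illustrative sketch of that rule, not webPDF's implementation; the `Detection` record and the `unique_per_page` function are hypothetical names introduced for this example.

```python
from typing import NamedTuple


class Detection(NamedTuple):
    """Hypothetical record for one recognized barcode (not a webPDF type)."""
    page: int
    type: str
    plain: str


def unique_per_page(detections: list[Detection]) -> list[Detection]:
    """Keep only the first detection for each (page, format, content) triple."""
    seen: set[Detection] = set()
    result: list[Detection] = []
    for detection in detections:
        if detection not in seen:
            seen.add(detection)
            result.append(detection)
    return result


raw = [
    Detection(1, "qrcode", "webPDF"),
    Detection(1, "qrcode", "webPDF"),   # same page, format, and content: dropped
    Detection(2, "qrcode", "webPDF"),   # different page: kept
    Detection(1, "code128", "webPDF"),  # different format: kept
]
results = unique_per_page(raw)
```

Note that the same content on a different page, or in a different format on the same page, still counts as a separate result.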
Generation mode
In generation mode, the web service will generate a barcode in the selected format and place it on as many pages as you want in the passed PDF document. The output document will always be a PDF document.
Supported formats
The following barcode formats are supported for both generation and recognition.
One-dimensional (linear) barcodes
One-dimensional barcodes are normally linear barcodes that encode values as a sequence of bars with different thicknesses. In this type of barcode format, only this sequence is relevant, i.e., the bars’ height is not important, which is why these barcodes are called one-dimensional barcodes. Accordingly, 1D barcodes normally involve few, if any, requirements concerning the barcode height. In contrast, their width is subject to strict rules, as the sequence of empty spaces and bars, and their width ratio in particular, must strictly adhere to the relevant specifications.
Codabar
| Character set: A-D, 0-9, 6 special characters | Capacity: 16 symbols + 4 optional characters as start and stop symbols | Encoded value: A1234567890A |
| The Codabar linear code was originally developed for the retail industry, but only plays a secondary role in it nowadays. Now, it is primarily used by libraries, photo labs, blood banks, and other specific businesses. Normally, the start and stop symbols provide information about the purpose of the encoded information. Thanks to its typically large spaces and bar thicknesses, Codabar remains easy to read at low resolutions, as well as in printouts with poor quality. However, the format has little information capacity and requires considerable space, rendering these advantages less useful than would be expected. |
Code 39
| Character set: A-Z, 0-9, 5 special characters | Capacity: Variable | Encoded value: WEBPDF$ |
| The Code 3 of 9 barcode is named that way because of the fact that three of the nine elements (bars) used to encode a codeword are wider than the others. Code 39 can optionally be used with a check digit, but in general is already considered to be self-checking due to its codeword structure. Thanks to its large character set, its variable length, and the ease with which it can be generated, this barcode is widely used by a large number of industries, including the electronics, chemical, warehousing, and shipping industries, among others. The asterisk symbol is always used for the start and stop symbols. However, both the start and stop symbols are also often left out when entering/outputting a Code 39 barcode and are handled automatically in the background. Unfortunately, the format has a very low information density per unit of space, and its character set is more limited than that available with Code 128, for instance. |
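The optional Code 39 check digit is commonly computed as a modulo-43 sum over the character values. The sketch below shows that general scheme; it is an assumption that webPDF uses this same variant when a check digit is requested, and `code39_check_char` is a hypothetical helper name.

```python
# Code 39 characters in the order that defines their modulo-43 values (0-42).
CODE39_ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ-. $/+%"


def code39_check_char(data: str) -> str:
    """Return the modulo-43 check character for a Code 39 payload.

    The start/stop asterisks must not be part of `data`.
    Raises ValueError if `data` contains a character outside the alphabet.
    """
    total = sum(CODE39_ALPHABET.index(ch) for ch in data)
    return CODE39_ALPHABET[total % 43]


check = code39_check_char("WEBPDF$")  # character values sum to 149, 149 % 43 == 20
```

For the sample value `WEBPDF$`, the character at position 20 of the alphabet is `K`, so the barcode content with check character would be `WEBPDF$K`.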
Code 128
| Character set: 128A: A-Z, 0-9, ASCII special characters, ASCII control characters, FNC 1-4; 128B: A-Z, a-z, 0-9, ASCII special characters, FNC 1-4; 128C: 00-99 | Capacity: Variable | Encoded value: webPDF |
| Code 128 is so named because it supports all 128 ASCII characters. Code 128 features a couple of special characteristics, the first being that it is possible to switch between 3 different character sets within the same barcode, which provides greater information density and expands the available range of characters even further. These three character sets do not constitute independent formats, as a Code 128 barcode will normally switch between all three of them as necessary in order to encode contents in as compact a manner as possible. In addition to this, Code 128 offers 4 FNC codes. Out of these, FNC4 extends the character set by adding all LATIN-1 (ISO 8859-1) characters. This format is extremely common worldwide, particularly in the packaging and shipping industries. It features its own specialized start and stop symbols, as well as the option of generating a checksum. |
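The Code 128 checksum is a weighted modulo-103 sum over the start symbol and the data symbols. As a minimal sketch, the function below handles only the simplest case, where the whole value is encoded in character set B without switching sets; `code128b_checksum` is a hypothetical name, and a real encoder would also mix in sets A and C.

```python
START_B = 104  # symbol value of the "Start Code B" character


def code128b_checksum(data: str) -> int:
    """Return the modulo-103 checksum value for data encoded purely in set B."""
    total = START_B
    for position, ch in enumerate(data, start=1):
        value = ord(ch) - 32  # set B maps ASCII 32..127 to symbol values 0..95
        if not 0 <= value <= 95:
            raise ValueError(f"character {ch!r} is not available in code set B")
        total += position * value  # each symbol is weighted by its position
    return total % 103


checksum = code128b_checksum("webPDF")  # (104 + 1023) % 103
```

The result is a symbol value, not a printable character; the encoder appends the corresponding bar pattern before the stop symbol.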
EAN-13
| Character set: 0-9 | Capacity: 13 digits | Encoded value: 5901234123457 |
| The "European Article Number 13" format is a widely used barcode format used for product labelling in the retail industry – the number 13 refers to the barcode’s maximum capacity of 13 digits. The main reason why this barcode is useful is the fact that it has a set length and strictly standardized contents. More specifically, an EAN-13 barcode consists of a GS1 country code (GS1 Prefix), a company code, a product code, and a check digit. This means that the format is very easy and quick to read, as well as to enter manually. There is also added flexibility in the fact that the country code can be replaced, for example, with an internal code at supermarkets in order to specify the relevant product’s use. |
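The check digit in these codes follows the GS1 modulo-10 scheme: starting from the rightmost payload digit, digits are weighted alternately by 3 and 1, and the check digit brings the weighted sum up to a multiple of 10. The sketch below uses the hypothetical helper name `gs1_check_digit`; the same calculation applies to the other GS1 formats with a different payload length.

```python
def gs1_check_digit(payload: str) -> int:
    """Compute the GS1 modulo-10 check digit for a payload without check digit."""
    if not (payload.isascii() and payload.isdigit()):
        raise ValueError("payload must consist of ASCII digits only")
    total = 0
    # Walk from the rightmost digit; it gets weight 3, the next one 1, and so on.
    for index, ch in enumerate(reversed(payload)):
        weight = 3 if index % 2 == 0 else 1
        total += int(ch) * weight
    return (10 - total % 10) % 10


def is_valid_gs1(code: str) -> bool:
    """Check a complete code (payload plus trailing check digit)."""
    return gs1_check_digit(code[:-1]) == int(code[-1])


ean13_ok = is_valid_gs1("5901234123457")  # sample value from the table above
```

For the sample value, the first 12 digits produce a weighted sum of 83, which yields the check digit 7.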
EAN-8
| Character set: 0-9 | Capacity: 8 digits | Encoded value: 65833254 |
| The "European Article Number 8" format is a shorter version of the EAN-13 barcode – the number 8 refers to the barcode’s maximum capacity of 8 digits. The main reason why this barcode is useful is the fact that it not only has a set length, but also one that is comparably very short. The EAN-8 barcode is primarily intended for labelling products for which an EAN-13 barcode would be too long. An EAN-8 barcode consists of a GS1 country code (GS1 Prefix), a product code, and a check digit. This means that the format is very easy and quick to read, as well as to enter manually. |
UPC-A
| Character set: 0-9 | Capacity: 12 digits | Encoded value: 03600029145 |
| Much like EAN barcodes, the Universal Product Code is a format for labelling products in the retail industry. UPC is compatible with EAN codes, and is the only format accepted for product labelling in the USA and Canada. Its main difference from EAN-13 is the numbering system digit that is found at its beginning: 0 - Normal UPC code; 2 - Products sold by weight; 3 - NDC (National Drug Code) and HRI (Health Related Items) codes, i.e., medical products; 4 - Unrestricted UPC code; 5 - Coupon; 6 - Normal UPC code; 7 - Normal UPC code. Digits 1, 8, and 9 are reserved for later assignment. The second through sixth digits are used to indicate the product’s manufacturer, with this information being followed by the product number and, finally, a check digit. In contrast to EAN barcodes, UPC codes are accepted worldwide and accordingly are preferred primarily by companies with international operations. The format is very easy and quick to read, as well as to enter manually. |
ITF
| Character set: 0-9 | Capacity: 14 digits | Encoded value: 98765432109213 |
| Just like UPC and EAN barcodes, Interleaved 2 of 5 (or, more properly, ITF-14) is a barcode format used by the retail industry. However, it is used primarily for labelling shipping packages and pallets. The first digit specifies the type of packaging, the next 12 digits contain the product number, and the final digit is the check digit. The format’s name comes from the way in which information is stored in it – it encodes pairs of digits, with the first digit being encoded in five bars and the second digit being encoded in the five spaces that follow these bars. The advantage of this approach is that it allows for a relatively high information density. The original ITF format did not have any character limitations, but it has also fallen into disuse. |
Two-dimensional barcodes
In two-dimensional barcodes, a value is encoded in a two-dimensional plane with the use of black and white pixels. 2D barcodes usually have a significantly greater information capacity than linear barcodes, but their higher complexity also means that, in some cases, they are considerably more prone to image errors. This, in turn, means that they need an error correction method. Both the height and width of 2D barcodes are subject to strict rules, as every pixel on the code can potentially contain important information. Accordingly, these formats very frequently involve requirements concerning the available heights and widths, width-to-height ratios, and the geometric shape of the barcode in general.
Data Matrix
| Character set: ASCII (1-255) | Capacity: Variable | Encoded value: webPDF |
| The contents in a Data Matrix code are encoded in the data region by using filled and empty cells. Depending on the selected type, these barcodes will have either a rectangular or square basic shape. Data Matrix barcodes feature a solid line at the left and bottom margins and segmented lines at the top and right margins – on one hand, this makes it possible to locate the barcode; on the other hand, it makes it possible to determine whether the barcode has been rotated. Data Matrix barcodes feature an integrated error correction mechanism based on the Reed-Solomon algorithm. This mechanism ensures that parts of the matrix can be recovered even if the code has been heavily damaged. |
When using the web service to recognize Data Matrix barcodes, it is absolutely necessary to make sure that the image area being scanned is limited to the Data Matrix barcode so that the barcode will be as centred within the area as possible. It is important to avoid sources of error such as text and other images as much as possible. In addition, Data Matrix barcodes require a "quiet zone" (a frame) around them without fail. The width of this quiet zone must be at least equal to the length of an encoding symbol’s side. Example: If a Data Matrix cell is 2 pixels by 2 pixels, the quiet zone must have a width of at least 2 pixels.
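The quiet zone rule translates into simple arithmetic when choosing the scan area: grow the barcode's bounding box by at least one cell width on every side. The sketch below is an illustration only; `Rect` and `with_quiet_zone` are hypothetical names and not part of the web service's parameters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle; units are whatever the caller uses (e.g. pixels)."""
    x: float
    y: float
    width: float
    height: float


def with_quiet_zone(barcode: Rect, cell_size: float, cells: int = 1) -> Rect:
    """Grow the barcode rectangle by `cells` cell widths on every side."""
    if cells < 1:
        raise ValueError("the quiet zone must be at least one cell wide")
    margin = cell_size * cells
    return Rect(
        barcode.x - margin,
        barcode.y - margin,
        barcode.width + 2 * margin,
        barcode.height + 2 * margin,
    )


# A 40 x 40 pixel Data Matrix with 2 x 2 pixel cells needs a 2 pixel frame.
padded = with_quiet_zone(Rect(100.0, 100.0, 40.0, 40.0), cell_size=2.0)
```

Keeping the margin symmetric also keeps the barcode centred in the selected area.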
QR code
| Character set: ASCII (1-255) | Capacity: Variable | Encoded value: webPDF |
| In Quick Response Codes, information is encoded in a manner similar to that used for Data Matrix codes, with filled and empty squares being used in a basic square shape. QR codes are optimized in such a way that they can be automatically recognized and read as quickly as possible. In fact, they are a very popular way to store information (such as a web address) in a format that can be easily recognized and read by cell phones (mobile tagging). Normally, QR codes feature three position markers that make it easier for scanners to recognize the barcode and its orientation. The maximum conceivable information content for a QR code is 2,956 bytes (just under 3 kB), but the actual capacity will also depend on the selected error correction level. This error correction level indicates the percentage of encoded data that it will be possible to restore with the Reed-Solomon algorithm (Low: 7%; Medium: 15%; Quartile: 25%; High: 30%). The higher the recovery percentage, the lower the remaining barcode capacity; however, higher levels also ensure that the barcode can sustain a greater amount of damage before becoming unreadable. |
Aztec
| Character set: ASCII (0-127), extended ASCII | Capacity: Variable | Encoded value: webPDF |
| In Aztec Codes, information is encoded with the use of empty and filled squares arranged concentrically around a square core. This core not only makes it possible to recognize the barcode, but also indicates its orientation. The resulting structure is reminiscent of stepped pyramids, which is where the format gets its name from. Each layer around the core is made up of two rings of encoding symbols, and the fact that each additional layer has longer sides means that it can represent more data. Layers are added outwards starting from the centre, meaning that the longer the encoded message, the more space an Aztec Code will need. Aztec barcodes feature an integrated error correction mechanism that is based on the Reed-Solomon algorithm and that can be configured to occupy any percentage of the barcode’s symbol capacity. This mechanism ensures that parts of the matrix can be recovered even if the code has been heavily damaged. To date, Aztec Codes have been used primarily to label pharmaceutical products, as well as for tickets for public transportation. |
When using the web service to recognize Aztec barcodes, it is absolutely necessary to make sure that the image area being scanned is limited to the Aztec barcode so that the barcode will be as centred within the area as possible.
PDF417
| Character set: ASCII | Capacity: Variable | Encoded value: webPDF |
| The "Portable Data File 417" barcode format is used first and foremost to encode relatively large amounts of data. Each code pattern consists of 4 bars and 4 empty spaces and has a length of 17 encoding units, which is where the number 417 comes from. PDF417 barcodes can consist of 3 to 90 rows, with each individual row essentially representing a linear barcode that also contains information regarding its content, row number, etc. The fact that the individual rows are independent from each other means that PDF417 barcodes can be read by most linear scanners as well. This sets this type of barcode apart from all other 2D barcodes, which require more complex image recognition. PDF417 barcodes feature an inner 8-level Reed-Solomon error correction algorithm, and the higher the level, the more resistant a barcode will be to damage. In addition, PDF417 barcodes have the option of qualifying the encoded content by type, which determines how many codewords the content requires and makes it possible to make these barcodes more compact: Text - each codeword represents two letters; Byte - each 5 codewords represent 6 bytes; Numeric - up to 15 codewords represent numbers with a length of up to 44 digits. |
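The effect of the three content types on barcode size can be estimated with a few lines of arithmetic. The sketch below is a rough estimate under stated assumptions: it ignores mode-switch codewords, sub-mode shifts within text mode, and error correction codewords, and it assumes that leftover bytes take one codeword each and that a shorter run of n digits takes n // 3 + 1 codewords. All function names are hypothetical.

```python
import math


def text_codewords(n_chars: int) -> int:
    """Text: two characters per codeword."""
    return math.ceil(n_chars / 2)


def byte_codewords(n_bytes: int) -> int:
    """Byte: 6 bytes per 5 codewords; leftover bytes assumed to take 1 each."""
    full_groups, rest = divmod(n_bytes, 6)
    return full_groups * 5 + rest


def numeric_codewords(n_digits: int) -> int:
    """Numeric: 44 digits per 15 codewords; shorter runs assumed n // 3 + 1."""
    full_groups, rest = divmod(n_digits, 44)
    return full_groups * 15 + ((rest // 3 + 1) if rest else 0)


# Data codewords needed for 44 digits under each content type.
estimate = {
    "text": text_codewords(44),
    "byte": byte_codewords(44),
    "numeric": numeric_codewords(44),
}
```

For long digit strings, the numeric type is clearly the most compact of the three, which is why choosing the right type matters for barcode size.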
Swiss QR invoice
| Character set: ASCII (1-255) | Capacity: Variable |
| The "QR Invoice" is a standard of the Swiss financial industry and replaces the "payment slips" commonly used there until 30 September 2022. The QR invoice combines a machine-readable QR code with human-readable invoice information and a receipt. The encoded content of the QR code is almost identical to the invoice information. A QR invoice can also be recognized by the characteristic Swiss cross in the centre of the barcode. In function, capacity, and technical implementation, the barcode is otherwise identical to other QR codes, so any process suitable for reading a QR code can also be used for QR invoices. |
Output formats
In recognition mode, the Barcode web service has two possible output formats for the document that will contain the barcodes found. This output format can be defined with the "outputFormat" option in the parameters.
The document’s format is described by the http://schema.webpdf.de/1.0/extraction/barcode.xsd schema.
XML
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<barcodes xmlns="http://schema.webpdf.de/1.0/extraction/barcode">
  <barcode type="qrcode" page="61" errorCorrectionLevel="L">
    <rectangle coordinates="user" x="407.0" y="73.0" width="43.5" height="44.0"/>
    <rectangle coordinates="pdf" x="407.0" y="725.0" width="43.5" height="44.0"/>
    <plain>webPDF</plain>
  </barcode>
</barcodes>
Every barcode element found in barcodes represents one recognized barcode, with its page and type attributes providing the page and format for it. Further attributes may follow these and carry format-specific metainformation about the barcode (the errorCorrectionLevel attribute in the example above, for instance).
The plain element contains the barcode’s decoded value, while the rectangle elements contain the position of the barcode on the corresponding page.
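A result document of this shape can be read with any namespace-aware XML parser. The following is a minimal sketch using Python's standard library; `read_barcodes` is a hypothetical helper name, and the sample bytes simply repeat the example document.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the example document; "b" is an arbitrary local prefix.
NS = {"b": "http://schema.webpdf.de/1.0/extraction/barcode"}

SAMPLE = b"""<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<barcodes xmlns="http://schema.webpdf.de/1.0/extraction/barcode">
  <barcode type="qrcode" page="61" errorCorrectionLevel="L">
    <rectangle coordinates="user" x="407.0" y="73.0" width="43.5" height="44.0"/>
    <rectangle coordinates="pdf" x="407.0" y="725.0" width="43.5" height="44.0"/>
    <plain>webPDF</plain>
  </barcode>
</barcodes>"""


def read_barcodes(xml_bytes: bytes) -> list[dict]:
    """Return one dict per <barcode> element with type, page, value, rectangles."""
    root = ET.fromstring(xml_bytes)
    results = []
    for node in root.findall("b:barcode", NS):
        rectangles = {
            rect.get("coordinates"): {
                key: float(rect.get(key)) for key in ("x", "y", "width", "height")
            }
            for rect in node.findall("b:rectangle", NS)
        }
        results.append({
            "type": node.get("type"),
            "page": int(node.get("page")),
            "plain": node.findtext("b:plain", default="", namespaces=NS),
            "rectangles": rectangles,
        })
    return results


barcodes = read_barcodes(SAMPLE)
```

Because the elements live in the schema's namespace, lookups without the namespace mapping would find nothing.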
JSON
The structure in JSON format corresponds to the contents of the XML structure.
{
  "barcodes" : {
    "barcode" : [ {
      "rectangle" : [ {
        "x" : 407.0,
        "y" : 73.0,
        "width" : 43.5,
        "height" : 44.0,
        "coordinates" : "user"
      }, {
        "x" : 407.0,
        "y" : 725.0,
        "width" : 43.5,
        "height" : 44.0,
        "coordinates" : "pdf"
      } ],
      "plain" : "webPDF",
      "type" : "qrcode",
      "page" : 61,
      "errorCorrectionLevel" : "L"
    } ]
  }
}
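The JSON result can be consumed directly with a standard JSON parser. The sketch below also compares the two rectangles: in the sample, the "user" and "pdf" entries differ only in y, and user y + height + pdf y adds up to 842, which matches the height of an A4 page in points. That the "user" system measures from the top edge and the "pdf" system from the bottom edge is an interpretation of the sample values, not something the schema description states; `implied_page_height` is a hypothetical helper name.

```python
import json

SAMPLE = """
{ "barcodes": { "barcode": [ {
    "rectangle": [
      { "x": 407.0, "y": 73.0,  "width": 43.5, "height": 44.0, "coordinates": "user" },
      { "x": 407.0, "y": 725.0, "width": 43.5, "height": 44.0, "coordinates": "pdf" }
    ],
    "plain": "webPDF", "type": "qrcode", "page": 61, "errorCorrectionLevel": "L"
} ] } }
"""


def implied_page_height(barcode: dict) -> float:
    """Page height implied by the two rectangles.

    Assumption: "user" y is measured from the top edge and "pdf" y from the
    bottom edge of the page, so the two offsets plus the height span the page.
    """
    by_system = {rect["coordinates"]: rect for rect in barcode["rectangle"]}
    user, pdf = by_system["user"], by_system["pdf"]
    return user["y"] + user["height"] + pdf["y"]


document = json.loads(SAMPLE)
first = document["barcodes"]["barcode"][0]
height = implied_page_height(first)
```

Note that "barcode" and "rectangle" are arrays, so code that reads the result should iterate over them rather than expect a single object.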

