XML report
Similar to PDF/A conversion, the transcribe operation provides several variants to create an error report in XML format for HTML export.
Translating PDF drawing instructions and content to HTML can be error-prone. The transcribe operation aims for the most accurate result possible, and it defines several fallback variants that provide the best possible replacement for instructions that cannot be translated directly. The error report summarizes the resulting problems, errors, and remarks so that the result can be evaluated.
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<report xsi:schemaLocation="http://schema.webpdf.de/1.0/report/transcribe http://schema.webpdf.de/1.0/report/transcribe.xsd"
xmlns="http://schema.webpdf.de/1.0/report/transcribe" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<analysis error="0" missingcontent="7" info="7" warning="0">
<detail code="0" page="1" type="info">Loading: 'F18' Font-type: 'PDType3Font'.</detail>
<!-- ... -->
</analysis>
<result>
<error>0</error>
<message></message>
</result>
</report>
The main node <report> generally contains two elements:
<analysis> = Any annotations, problems or errors found.
<result> = Error code of the performed conversion (0 = successful conversion).
Errors and corrections
The <error> element inside the <result> node holds the result code of the conversion. A value of 0 means the conversion was successful.
<result>
<error>29</error>
<message>Font extraction failed for F49: (Unsupported Postscript Type3 Font without font descriptor: "F49")
</message>
</result>
A value greater than 0 means that a fatal error occurred. The <message> entry explains this error number.
A more detailed description and possible further hints are available in the <analysis> node:
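As a minimal sketch of how a client could evaluate the <result> node, the following Python snippet reads the error code and message with the standard library. The function name `read_result` is hypothetical and not part of the product; the namespace URI is the one shown in the sample report, and the sketch assumes the report is available as a string.

```python
import xml.etree.ElementTree as ET

# Namespace as declared in the sample report.
NS = {"r": "http://schema.webpdf.de/1.0/report/transcribe"}


def read_result(report_xml):
    """Return (error_code, message) from the <result> node of a report.

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(report_xml)
    result = root.find("r:result", NS)
    if result is None:
        raise ValueError("report has no <result> node")
    error_text = result.findtext("r:error", default="", namespaces=NS).strip()
    if not error_text:
        raise ValueError("<result> has no <error> value")
    # The sample message ends with a line break, so strip whitespace.
    message = result.findtext("r:message", default="", namespaces=NS).strip()
    return int(error_text), message
```

A caller could then treat any code other than 0 as a failed conversion and log the returned message.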
<analysis error="0" missingcontent="7" info="7" warning="0">
<detail code="0" page="1" type="info">Loading: 'F18' Font-type: 'PDType3Font'.</detail>
<!-- ... -->
<detail code="29" type="missingContent">Font extraction failed for F18: (Unsupported Postscript Type3 Font without
font descriptor: "F18")
</detail>
<!-- ... -->
</analysis>
type = Severity of the problem. The assessment of errors that have occurred is carried out according to the following levels:
  error = A clear error has occurred that would make further processing impossible or lead to serious defects in the result.
  missingContent = An error occurred during the translation of a single element that could not be compensated. The resulting document will be functional, but correspondingly incomplete or different from the source.
  warning = An error has occurred that allows further processing but makes defects or inaccuracies in the result likely.
  info = Does not denote a direct error, but a note worth mentioning, which could have a relevance for the evaluation of the result.
page = Page in the PDF document where the error occurred during editing (if the page is "0", then a global PDF object is affected).
code = Internal error code of the HTML export.
The text content of the <detail> node describes the error.
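To illustrate how the type, page, and code attributes might be used when evaluating a report, the sketch below groups all <detail> entries by severity. The helper name `details_by_type` and the returned dictionary layout are assumptions made for this example. The page attribute is treated as optional because one sample entry omits it.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Namespace as declared in the sample report.
NS = {"r": "http://schema.webpdf.de/1.0/report/transcribe"}


def details_by_type(report_xml):
    """Group <detail> entries of the <analysis> node by their type attribute.

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(report_xml)
    grouped = defaultdict(list)
    for detail in root.iterfind("r:analysis/r:detail", NS):
        page = detail.get("page")
        grouped[detail.get("type", "unknown")].append({
            "code": detail.get("code"),
            # page may be absent; "0" refers to a global PDF object.
            "page": int(page) if page is not None else None,
            # Collapse line breaks inside wrapped descriptions.
            "text": " ".join((detail.text or "").split()),
        })
    return dict(grouped)
```

With such a grouping, a caller could, for example, accept a document that only has info entries and flag one with missingContent entries for manual review.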