Text extraction


While reviewing the reported quality feedback, we identified business documents in which a value is visible to the human eye but was not captured. As a result, a default value was populated in the business document data file (XML). The default value can be blank or a dummy value, and it is defined in the implementation guide (SVG/DCG).
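
The fallback to a default value can be sketched as follows. This is a minimal illustration only, not the actual data capture implementation: the element names, the `DEFAULTS` table, and the `build_data_file` helper are hypothetical, and the real defaults are the ones defined in the implementation guide.

```python
import xml.etree.ElementTree as ET

# Hypothetical fields and defaults; the real ones are defined in the
# implementation guide. A default can be blank or a dummy value.
DEFAULTS = {
    "InvoiceNumber": "UNKNOWN",
    "PurchaseOrderNumber": "",
}


def build_data_file(extracted: dict) -> str:
    """Build the XML data file, using the default for any field not captured."""
    root = ET.Element("Invoice")
    for field_name, default in DEFAULTS.items():
        # A missing or empty extracted value falls back to the default.
        value = extracted.get(field_name) or default
        ET.SubElement(root, field_name).text = value
    return ET.tostring(root, encoding="unicode")


# The PO was visible to a human reader but was not captured,
# so the data file carries the blank default instead.
xml_text = build_data_file({"InvoiceNumber": "INV-1001"})
print(xml_text)
```

In this sketch the purchase order element is still written to the data file, but with the blank default rather than the value a human reader would see on the PDF.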

In these cases, the expected value was added on top of the invoice file (PDF) as an additional layer, for example as a comment. Alternatively, the information can be inside an image while the rest of the PDF file is machine-readable (metadata). Automatic processing cannot extract PDF comments or image-based information from the business document PDF file to the data file (XML).


For example, a supplier did not have the purchase order number (PO) available when the invoice was created in the supplier's own invoicing system. The invoice was sent to the buyer, who added the PO as a comment to the PDF file and then emailed the PDF file to the data capture service. The invoice layout was templated in the data capture service's automatic Gateway system, which reads only the technical data file (metadata). The comment layer is not part of the PDF metadata, so the PO was not extracted to the data file (XML). A PDF file cannot be edited without special software, but certain PDF programs allow comments to be placed on top of it.
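
The scenario above can be modeled with a short sketch. It is a simplified illustration under stated assumptions, not the Gateway system's actual logic: the `PdfDocument` class, its attributes, the `extract_fields` function, and the sample values are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PdfDocument:
    """Simplified, hypothetical model of the layers in an invoice PDF."""
    text_layer: dict                                # machine-readable data
    comments: list = field(default_factory=list)    # comment layer added on top
    image_text: list = field(default_factory=list)  # text visible only in images


def extract_fields(pdf: PdfDocument, wanted: list) -> dict:
    """Read the wanted fields from the machine-readable layer only.

    Comments and image content are never consulted, which mirrors
    why a value added as a comment is not captured.
    """
    return {name: pdf.text_layer.get(name) for name in wanted}


# The supplier's invoice has no PO; the buyer adds it as a comment.
invoice = PdfDocument(
    text_layer={"InvoiceNumber": "INV-1001"},
    comments=["PO 4500012345"],
)

fields = extract_fields(invoice, ["InvoiceNumber", "PurchaseOrderNumber"])
print(fields)
```

Here the purchase order field comes back empty even though the number is present in the comment layer, because extraction in this model looks only at the machine-readable layer.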

Customer actions

If the supplier does not provide the value on the invoice itself, make the change in your invoice processing system rather than placing comments in the PDF file.

Basware actions

This error does not require any actions from Basware.

How can I ask questions or raise suggestions?

If you have further questions about the "Text extraction" finding, or if you have suggestions, we welcome your feedback. Please contact Basware Support by filing a support case.

For a comprehensive summary of the renewed quality feedback process and the selected improvements that have been applied, please review the Data capture feedback analysis process.