...

This is especially useful for payable invoices, for example: information can be extracted from the PDF invoices and reviewed before they are forwarded to the internal invoice approval process.

...

...

Info
  • Currently, PIPEFORCE supports Google Document Understand as the AI backend for extracting data from invoices, since it gave the best results when we tested multiple solutions.

  • In the future, we will extend this so that our built-in models and other providers can be used as well.

  • Note: In production, Google Cloud with location Germany (Frankfurt) is used.

How to execute it?

In order to send a given PDF document to the Document Understand AI backend and to extract the required data, you can use the command ai.document.understand in your pipeline.

Info

Note: The command alias document.understand can be used in the same way as ai.document.understand. Whether one or both are supported depends on the version.

This command loads the document, validates the given understand config parameters, sends the document and the instructions to the AI, and finally returns a JSON response with the values extracted from the document, which can be further processed in the pipeline.

Here is an example of how to use this command in a pipeline:

Code Block
languageyaml
pipeline:
  - ai.document.understand:
      secret: DOCUMENT_UNDERSTANDING_GOOGLE
      provider: google
      input: $uri:drive:my-invoice.pdf
      config: {
          "projectId": "my-project",
          "location": "de",
          "processorId": "3e05c9d1c5386f42"
        }

The parameters of the command are these:

  • secret
    The name of the secret to be used to connect to the Document Understand AI backend.

  • provider
    The AI backend to be used. There are different implementations possible:

    • google = Uses Google’s Document Understand cloud service (hosted in Germany if it is part of the enterprise plan).

    • aws = Uses the AWS Document Understand cloud service.

    • <custom> = Uses a self-hosted AI model and the provided endpoint to solve this problem (coming soon).

  • input
    The input document to be sent to the AI. Can be any PIPEFORCE URI. If no input parameter is specified, the input document is expected in the body of the pipeline.

  • config
    The configuration or prompt required by the AI backend to fulfill the Document Understand request and return the expected JSON. This configuration depends on the AI backend selected with the provider parameter. See the documentation of the provider. For Google Document Understand the parameters are:

    • projectId = The Google Cloud project to be used.

    • location = The location of the processor.

    • processorId = The pre-trained processor to be used for data extraction.
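
The following is a minimal sketch, not part of PIPEFORCE, of how the config JSON for the google provider could be checked before it is put into a pipeline. The helper name missing_google_config_keys is hypothetical, and treating all three listed parameters as required is an assumption.

```python
import json

# The three parameters listed above for the Google provider.
# Assumption: all of them are required.
REQUIRED_GOOGLE_KEYS = ("projectId", "location", "processorId")


def missing_google_config_keys(config_json: str) -> list:
    """Return the listed keys that are absent or empty in the config JSON."""
    config = json.loads(config_json)
    if not isinstance(config, dict):
        raise ValueError("config must be a JSON object")
    return [key for key in REQUIRED_GOOGLE_KEYS if not config.get(key)]


example = '{"projectId": "my-project", "location": "de"}'
print(missing_google_config_keys(example))  # ['processorId']
```

An empty list means that all three parameters are present. The command itself validates the config as well, so a check like this only helps to catch typos earlier.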

How to review and approve the extracted data in a form?

If you would like the Document Understand result to be reviewed and approved by a human user, you can use forms and provide a review step in your app or workflow.

The first step is to create a form config, set its type to documentUnderstand, and reference the location of the schema, as this example shows:

Code Block
languagejson
{
	"title": "Document Understand",
	"type": "documentUnderstand",
	"schema": "$uri:property:global/app/myapp/schema/document-understand"
}

Use the output attribute to define the PIPEFORCE URI, for example a pipeline or command, to be called after the data review is done and the form has been submitted. This is the location where the final Document Understand JSON result will be written to.

In the next step, create another JSON document: the schema for the Document Understand form. It configures the structure of the form and the mapping from Document Understand result fields to form fields.

Here is an example of such a schema, including the config section for the field mapping:

Code Block
languagejson
{
    "title": "Document Understand",
    "type": "documentUnderstand",
    "output": "...",
    "config": {
        "type": "invoice",
        "name": "My Invoice Field Detector",
        "fields": [
            {
                "id": "invoice_number",
                "mapping": "invoice_id",
                "label": "Invoice number"
            },
            {
                "id": "invoice_date",
                "mapping": "invoice_date",
                "label": "Invoice date"
            },
            {
                "id": "line_item",
                "mapping": "line_item",
                "label": "Invoice items",
                "columns": [
                    {
                        "id": "line_item/description",
                        "mapping": "line_item/description",
                        "label": "Description"
                    },
                    {
                        "id": "line_item/quantity",
                        "mapping": "line_item/quantity",
                        "label": "Quantity"
                    },
                    {
                        "id": "line_item/amount",
                        "mapping": "line_item/amount",
                        "label": "Amount"
                    }
                ]
            }
        ]
    }
}

The title defines the header to be displayed when the form is shown.

The type inside the config section defines the type of document to be supported. The default is invoice.

The output defines the PIPEFORCE URI to which the final result JSON is written after the form has been confirmed.

The fields section defines the mapping of data fields extracted from the document to form fields. It is an array of JSON objects with these attributes:

  • id
    The custom id of the form field and thus the name of the field in the final JSON result. This can be any ASCII name. If not specified, the value of the mapping attribute is used by default.

  • mapping
    The id of the field as delivered by the Document Understand AI backend. See the Data fields section below for the field ids which can be extracted.

  • label
    The label to be displayed on the form field. If missing, the id will be used. This label can also be internationalized. See below.
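
To illustrate how id and mapping relate, here is a minimal sketch that applies a fields config to extracted values. It is not the PIPEFORCE implementation: the function apply_field_mapping is hypothetical, and the shape of the extracted data (a flat object keyed by backend field id, with line_item as a list of objects) is an assumption. Skipping fields the backend did not return is also an assumption.

```python
def apply_field_mapping(fields: list, extracted: dict) -> dict:
    """Rename backend field ids (mapping) to form field ids (id)."""
    result = {}
    for field in fields:
        mapping = field["mapping"]
        # As described above, id falls back to the value of mapping.
        field_id = field.get("id") or mapping
        if mapping not in extracted:
            continue  # assumption: fields without a value are skipped
        value = extracted[mapping]
        columns = field.get("columns")
        if columns and isinstance(value, list):
            # Table-like fields: map every row with the column config.
            value = [apply_field_mapping(columns, row) for row in value]
        result[field_id] = value
    return result


fields = [
    {"id": "invoice_number", "mapping": "invoice_id", "label": "Invoice number"},
    {"mapping": "invoice_date"},
]
extracted = {"invoice_id": "12345", "invoice_date": "01.01.2024"}
print(apply_field_mapping(fields, extracted))
```

In this sketch the backend field invoice_id ends up as invoice_number in the result, while invoice_date keeps its name because no id was configured for it.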

Info

Since this form feature is an optional add-on, make sure the secret DOCUMENT_UNDERSTANDING_GOOGLE exists in your instance. If not, contact support.

Data fields

By default, the Google Document Understand AI backend is used (with server location in Germany) to extract data fields from documents. Below you can find a list of the most important fields which can be extracted from an invoice document using this AI backend.

| Mapping Field         | Description                        |
|-----------------------|------------------------------------|
| invoice_date          | The date of the invoice            |
| delivery_date         | Date of delivery                   |
| invoice_id            | Invoice number                     |
| purchase_order        | Invoice reference                  |
| receiver_name         | Invoice recipient name             |
| receiver_address      | Invoice recipient address          |
| ship_to_address       | Shipment address                   |
| supplier_name         | The name of the supplier           |
| supplier_iban         | The IBAN of the supplier           |
| due_date              | The due date of the invoice        |
| net_amount            | The invoice amount (net)           |
| total_amount          | The total invoice amount           |
| currency              | The currency used in the invoice   |
| vat/tax_rate          | The tax rate used in the invoice   |
| line_item             | A single line item in the invoice  |
| line_item/description | The description of the line item   |
| line_item/quantity    | The quantity of the line item      |
| line_item/amount      | The amount of the line item        |

For a full list of field mapping ids, see the supported fields of the invoice processor in the Google Document Understand documentation: https://cloud.google.com/document-ai/docs/processors-list#processor_invoice-processor.

You can additionally add any of these fields to your fields mapping config as described in the example above.

The output format

After the form has been submitted, a JSON document which contains the extracted fields and the document embedded as base64 is created and stored at the location specified by the output attribute.

Here is an example of this output document:

Code Block
languagejson
{
  "fields": {
    "invoice_number": "12345",
    "invoice_date": "01.01.2024",
    "line_item": [
      {
        "line_item/description": "Some item",
        "line_item/quantity": "2",
        "line_item/amount": "13,99"
      }
    ],
    ...
  },
  "document": {
    "filename": "myinvoice.pdf",
    "contentLength": 1234,
    "contentType": "application/pdf",
    "contentEncoding": "base64",
    "content": "base64EncodedFileContent"
  }
}

The field names correspond to the id of each field as configured in the fields mapping.

The document is a content reference JSON with the document data base64 encoded.

Also see: Content References (Files)
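
As a minimal sketch of how a consumer of this output could get back the original file, the embedded document can be decoded with a standard base64 decoder. The function decode_document is a hypothetical helper, and the sample content is made up for illustration.

```python
import base64
import json


def decode_document(output_json: str):
    """Return (filename, raw bytes) of the document embedded in the output JSON."""
    output = json.loads(output_json)
    document = output["document"]
    if document.get("contentEncoding") != "base64":
        raise ValueError("unsupported contentEncoding")
    return document["filename"], base64.b64decode(document["content"])


# Made-up sample in the structure shown above.
sample = json.dumps({
    "fields": {"invoice_number": "12345"},
    "document": {
        "filename": "myinvoice.pdf",
        "contentType": "application/pdf",
        "contentEncoding": "base64",
        "content": base64.b64encode(b"%PDF-1.4 sample").decode("ascii"),
    },
})
filename, data = decode_document(sample)
print(filename, len(data))
```

The returned bytes can then be written to a file or passed on to another system.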

Internationalizing Labels (i18n)

It is possible to translate the labels of the form fields to different languages.

If no label is given for a field, this label value will be used as default:

Code Block
$uri:i18n:document-understand/<id> 

Here, <id> is the id of the field. The UI expects this i18n key to exist. For example:

Code Block
$uri:i18n:document-understand/invoice_number 

If the label starts with the prefix $uri:i18n:, the value is replaced by the message for the referenced i18n key. The format of the i18n URI path is this:

Code Block
<appName>/<contextName>/<messageKey>

So <appName> (optional) maps to the app parameter and <contextName> (optional) maps to the context parameter of the command i18n.message, while <messageKey> maps to the message key inside the JSON finally returned by the command. For example, if the user sets this label:

Code Block
"label": "$uri:i18n:io.pipeforce.myapp/invoice/invoice_number"

This would map to a message JSON which can be returned using the command i18n.message?app=io.pipeforce.myapp&context=invoice. And inside this message JSON, the value of attribute invoice_number for the currently selected language will be returned:

Code Block
{
  "invoice_number": "Rechnungsnummer",
  ...
}

If no <appName> is given, io.pipeforce.common will be used by default.

If no <contextName> is given, the context default will be used.

So this example would expect the message entry invoice_number in a message JSON located in app io.pipeforce.common with context set to default:

Code Block
$uri:i18n:invoice_number

And this will use the context document-understand inside the default app io.pipeforce.common (since the first part in the path is missing):

Code Block
$uri:i18n:document-understand/invoice_number
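
The resolution rules described in this section can be sketched as follows. This is only an illustration of the rules as stated here, not the PIPEFORCE implementation. The names parse_i18n_label and I18nRef are hypothetical, and rejecting paths with empty or more than three segments is an assumption.

```python
from typing import NamedTuple, Optional

I18N_PREFIX = "$uri:i18n:"
DEFAULT_APP = "io.pipeforce.common"
DEFAULT_CONTEXT = "default"


class I18nRef(NamedTuple):
    app: str
    context: str
    key: str


def parse_i18n_label(label: str) -> Optional[I18nRef]:
    """Split an i18n label into app, context and message key.

    Returns None for plain labels without the i18n prefix.
    """
    if not label.startswith(I18N_PREFIX):
        return None
    parts = label[len(I18N_PREFIX):].split("/")
    if not all(parts) or len(parts) > 3:
        raise ValueError("unexpected i18n path: " + label)
    if len(parts) == 1:
        # Only the message key: default app and default context.
        return I18nRef(DEFAULT_APP, DEFAULT_CONTEXT, parts[0])
    if len(parts) == 2:
        # Context and message key: default app.
        return I18nRef(DEFAULT_APP, parts[0], parts[1])
    return I18nRef(parts[0], parts[1], parts[2])


print(parse_i18n_label("$uri:i18n:document-understand/invoice_number"))
```

The resulting app and context correspond to the app and context parameters of the command i18n.message, and key is the attribute to look up in the returned message JSON.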