What is Document Understand?

PIPEFORCE comes with a solution that uses AI to automatically detect and extract data from a given (PDF) document and then displays the extracted fields on a form. The user can review the field values and adjust them if required.

This is especially useful for payable invoices, for example: the information is extracted from the PDF invoices and reviewed before they are forwarded to the internal invoice payable process.

How to execute it?

To send a given PDF document to the Document Understand AI backend and extract the required data, use the command ai.document.understand in your pipeline.

This command loads the document, validates the given understand config parameters, sends the document and the instructions to the AI, and finally returns a JSON response with the values extracted from the document. This response can be processed further in the pipeline.

Here is an example of how to use this command in a pipeline:

pipeline:
  - ai.document.understand:
      secret: google-document-understand
      provider: google
      input: $uri:drive:my-invoice.pdf
      config: {
          "projectId": "my-project",
          "location": "de",
          "processorId": "3e05c9d1c5386f42"
        }

The command accepts the following parameters:

secret
The name of the secret to be used to connect to the Document Understand AI backend.
provider
The AI backend to be used. The following implementations are possible:
- google = Uses Google’s Document Understand cloud service (hosted in Germany if it is part of the enterprise plan).
- aws = Uses the AWS Document Understand cloud service.
- <custom> = Uses a self-hosted AI model at the provided endpoint.
input
The input document to be sent to the AI. Can be any PIPEFORCE URI. If no input parameter is specified, the input document is expected in the body of the pipeline.
config
The configuration or prompt required by the AI backend to fulfill this document understand request and return the expected JSON. It depends on the AI backend selected with the provider parameter; see the documentation of the provider. For Google Document Understand the parameters are:
- projectId = The Google Cloud project to be used.
- location = The location of the processor.
- processorId = The pre-trained processor to be used for data extraction.
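
The config for the google provider can be checked before a pipeline is sent. The following is a minimal Python sketch, not PIPEFORCE's own validation logic: the helper name missing_google_config_keys is hypothetical, and it assumes that all three keys listed above are mandatory, which is not stated explicitly here.

```python
# Hypothetical pre-flight check; this is not part of PIPEFORCE.
# Assumption: projectId, location and processorId are all mandatory
# for the provider "google".
REQUIRED_GOOGLE_KEYS = ("projectId", "location", "processorId")


def missing_google_config_keys(config):
    """Return the required keys that are absent or blank in config."""
    return [
        key
        for key in REQUIRED_GOOGLE_KEYS
        if not str(config.get(key) or "").strip()
    ]


if __name__ == "__main__":
    config = {
        "projectId": "my-project",
        "location": "de",
        "processorId": "3e05c9d1c5386f42",
    }
    # A complete config yields an empty list.
    print(missing_google_config_keys(config))
    # An incomplete config yields the names of the missing keys.
    print(missing_google_config_keys({"projectId": "my-project"}))
```

Returning the list of missing keys, rather than raising on the first one, lets the caller report all problems at once.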

How to review and approve the extracted data?

If you would like a human user to review and approve the Document Understand result, you can use forms and provide a review step in your app or workflow.

The first step is to create a form config, set its type to documentUnderstand, and point schema to the location of the schema, as this example shows:

{
	"title": "Document Understand",
	"type": "documentUnderstand",
	"schema": "$uri:property:global/app/myapp/schema/document-understand"
}
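
A missing comma or a wrong type value in a form config is easy to overlook. The following Python sketch shows one possible way to check such a config before it is deployed. The helper name check_form_config is hypothetical and not part of PIPEFORCE, and the two rules it checks are taken only from the description above.

```python
import json


def check_form_config(text):
    """Parse a form config and return a list of problems found."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError as err:
        return [f"invalid JSON: {err.msg} (line {err.lineno})"]
    if not isinstance(config, dict):
        return ["form config must be a JSON object"]
    problems = []
    # Rule taken from the text: type must be set to documentUnderstand.
    if config.get("type") != "documentUnderstand":
        problems.append('type must be "documentUnderstand"')
    # Rule taken from the text: schema holds the location of the schema.
    if not isinstance(config.get("schema"), str) or not config["schema"]:
        problems.append("schema must be a non-empty string")
    return problems


if __name__ == "__main__":
    form_config = """
    {
        "title": "Document Understand",
        "type": "documentUnderstand",
        "schema": "$uri:property:global/app/myapp/schema/document-understand"
    }
    """
    print(check_form_config(form_config))
```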

In the next step, create another JSON document: the schema for the document understand form. It configures the structure of the form and the mapping from Document Understand result fields to form fields.

Here is an example of such a schema, including the config section for the field mapping:

{
    "title": "Document Understand",
    "type": "documentUnderstand",
    "output": "...",
    "config": {
        "type": "invoice",
        "name": "Invoice",
        "fields": [
            {
                "id": "invoice_number",
                "mapping": "invoice_id",
                "label": "Invoice number"
            },
            {
                "id": "invoice_date",
                "mapping": "invoice_date",
                "label": "Invoice date"
            },
            {
                "id": "line_item",
                "mapping": "line_item",
                "label": "Invoice items",
                "columns": [
                    {
                        "id": "line_item/description",
                        "mapping": "line_item/description",
                        "label": "Description"
                    },
                    {
                        "id": "line_item/quantity",
                        "mapping": "line_item/quantity",
                        "label": "Quantity"
                    },
                    {
                        "id": "line_item/amount",
                        "mapping": "line_item/amount",
                        "label": "Amount"
                    }
                ]
            }
        ]
    }
}

As you can see, each data field from the Document Understand response is mapped to a form field inside the fields array. Each entry supports these attributes:

id
The custom id of the form field and thus its name in the final JSON result. This can be any ASCII name. If not specified, the value of the mapping attribute is used by default.
mapping
The id of the field coming from the Document Understand AI backend. See the documentation of the backend below for the field ids which can be extracted.
label
The label to be displayed on the form field. If missing, the id will be used. This label can also be internationalized. See below.
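
The defaulting rules for id and label, and the renaming from mapping to id, can be sketched in a few lines of Python. This is an illustration only, not how PIPEFORCE implements the form: the function names are hypothetical, and the shape of the response (a flat object keyed by backend field id, with table-like fields as a list of row objects) is an assumption, not the documented output of ai.document.understand.

```python
# Illustrative only: the shape of `response` below is an assumption,
# not the documented output of ai.document.understand.


def field_id(field):
    """Form field id; falls back to the mapping value if id is missing."""
    return field.get("id") or field["mapping"]


def field_label(field):
    """Form field label; falls back to the field id if label is missing."""
    return field.get("label") or field_id(field)


def map_response(fields, response):
    """Rename backend field ids (mapping) to form field ids (id)."""
    result = {}
    for field in fields:
        value = response.get(field["mapping"])
        columns = field.get("columns")
        if columns and isinstance(value, list):
            # Table-like field: apply the column mapping to every row.
            value = [map_response(columns, row) for row in value]
        result[field_id(field)] = value
    return result


if __name__ == "__main__":
    fields = [
        {"id": "invoice_number", "mapping": "invoice_id", "label": "Invoice number"},
        {"mapping": "invoice_date"},  # id and label are derived from mapping
        {
            "id": "line_item",
            "mapping": "line_item",
            "columns": [
                {"id": "line_item/description", "mapping": "line_item/description"},
                {"id": "line_item/amount", "mapping": "line_item/amount"},
            ],
        },
    ]
    response = {  # made-up sample values
        "invoice_id": "INV-001",
        "invoice_date": "2024-01-31",
        "line_item": [
            {"line_item/description": "Consulting", "line_item/amount": "100.00"}
        ],
    }
    print(map_response(fields, response))
    print([field_label(f) for f in fields])
```

Fields that the backend did not return end up with the value None in this sketch, so a review form built on top of it could show them as empty inputs.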

Data fields

By default, the Google Document Understand AI backend is used (with server location in Germany) to extract data from documents. Below you can find a list of the most important fields which can be extracted from a document using this backend. For the full documentation of the supported fields, see the Google Document Understand documentation.

TODO Table of custom fields