
What is a Data Pipeline?

A Data Pipeline, or just Pipeline for short, is a YAML script in PIPEFORCE which describes the flow of data from one endpoint to another. Between these endpoints, the data can be enriched, transformed, cleansed and so on, so that it becomes compatible between the integration endpoints.

Such a pipeline typically consists of one or more so-called Commands to reach these goals. A command is a server-side task.

Here is an example of such a pipeline YAML script which simply downloads a JSON document and stores it in the attached cloud storage:

Code Block
languageyaml
pipeline:
  - http.get: https://somedomain.tld/rest/contracts
  - drive.save: contracts.json 

The web portal of the enterprise version of PIPEFORCE also offers a no-code editor, so you can design such pipelines via drag & drop:

...

What is a Command?

A Command in PIPEFORCE is a server-side operation which covers a single task. It can be executed remotely via HTTP. It takes an optional input body and optional parameters, processes a certain task, and finally produces an optional output which is the response to the caller. Here is an example of a command URL which can be called via an HTTP GET request:

...
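For illustration, such a URL could look like this, following the endpoint structure described under Executing a Command below (host name hypothetical):

Code Block
https://hub-mycompany.pipeforce.net/api/v3/command/log?message=HELLO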

You can find all built-in commands in the commands reference.

Command Name

Each command has a unique name which is always written in lower case and follows dot notation. Here are some examples of valid command names in dot notation:

...
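Command names which appear elsewhere in this guide illustrate this pattern:

Code Block
http.get
drive.save
mail.send
property.put
config.get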

Info

COMMAND NAMES VS. REST RESOURCE NAMES

Even though many command names have a resource-based semantic similar to what HTTP GET, POST or PUT do in REST, they do not follow this approach 100%, since a Command is typically bound to a server-side operation, not only to a resource operation. Therefore, the operation type of a command is defined by its name, not by a method header. For example: property.put or config.get, to name just a few.

Command Alias

Besides the default command name, a command can also have one or more aliases. These are alternative names which can be used in the same way as the default names. For example, the command mail.send also has the alias mail. Both can be used interchangeably for sending emails since they point to the same command implementation.

For a list of the alias names of a command, see the command documentation.

Command Parameters

Commands can have zero to many parameters, where each parameter is a name-value pair. The parameters can be passed to the command in different ways, depending on the execution context you're working in. See Executing a Command below.

...

Code Block
languageyaml
pipeline:
  - log:
      message: {"key": "value"}

Default Command Parameter (short form)

Since Version 9.0

...

It depends on the command whether it supports a default parameter or not. See the command docs to find out: default parameters are marked with (default).
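As a sketch, assuming message is the default parameter of the log command, the long form and the short form would look like this:

Code Block
languageyaml
pipeline:
  # Long form: parameter set explicitly by name
  - log:
      message: "HELLO"
  # Short form: value passed via the default parameter
  - log: "HELLO"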

Limitations

Default parameter value can only be primitive

The value of a default parameter can only be a primitive like a string, number or boolean. It cannot be a JSON object or array. If you need to pass JSON, pass it as a string if the command supports this. In most cases, the command will auto-convert from string to JSON if required. For example:

...

Note

The default parameter value can only be a primitive like a string, number or boolean, but not an object or array.
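A sketch of this workaround, passing the JSON from the earlier log example as a quoted string via the default parameter:

Code Block
languageyaml
pipeline:
  # The JSON is passed as a string; the command can auto-convert it if supported
  - log: '{"key": "value"}'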

Mixing default and ordinary parameters not allowed

Note that mixing the default parameter with ordinary parameters is not possible, since the YAML specification does not allow this:

...

Note

If you use the default parameter of a command, you cannot set any other ordinary parameter on it.
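To illustrate this limitation, the following is not valid YAML and will be rejected (the level parameter is hypothetical):

Code Block
languageyaml
pipeline:
  - log: "HELLO"      # default parameter as scalar value
      level: "DEBUG"  # invalid: no ordinary parameters possible here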

Executing a Command

A single Command can be executed by sending it as an HTTP GET or POST request to the endpoint /api/v3/command. The full URL structure of this endpoint is always like this:

...

  • Replace HUB with the hub host name of your instance (for example hub-mycompany.pipeforce.net).

  • Replace <command.name> with the name of the command you would like to execute.

  • Replace <param1>=<value1> to <paramN>=<valueN> with the optional parameters of your command.

HTTP GET

Here is an example of executing the log command as an HTTP GET request, setting its message parameter to a string value using an HTTP request parameter:

...
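As a sketch in curl, using the hypothetical host name from above (authentication omitted for brevity):

Code Block
curl "https://hub-mycompany.pipeforce.net/api/v3/command/log?message=HELLO"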

See the HTTP Execution Reference for a summary of all supported HTTP options.

HTTP POST

Here is an example of executing a single command as an HTTP POST request, setting the message parameter of the log command using an HTTP POST data body in curl:

...
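As a sketch in curl, again with a hypothetical host name and authentication omitted:

Code Block
curl -X POST "https://hub-mycompany.pipeforce.net/api/v3/command/log" -d "message=HELLO"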

See the HTTP Execution Reference for a summary of all supported HTTP options.

CLI

You can also use the PIPEFORCE CLI to execute a single Command. Here is an example of calling the log command and setting the message parameter accordingly:

Code Block
pi command log message=HELLO

Body

Also see: Pipeline Body.

Besides parameters, a command can also consume and produce a body, similar to an HTTP POST request and response.

...

See the HTTP Execution Reference for a summary of all supported HTTP options.

...

Pipeline in Detail

Two or more Commands can be chained into a flow, called a Pipeline. If such a pipeline gets executed, each command in it is executed one after another, where the output message of one command becomes the input message of the next command, and so on:

...

By default, the message output (= body) of the first command (datetime in this example) automatically becomes the input message (= body) of the next command (log in this example). Therefore, no declaration and exchange of variables is required here. The exchange of body data between commands is implicit.
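A minimal sketch of such a two-command pipeline, where datetime produces the current date as output body and log prints it:

Code Block
languageyaml
pipeline:
  - datetime
  - log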

Parameters

In case you need to configure a command by specifying parameters for it in a pipeline, you can do so by writing them below the command as name-value pairs, indented by at least two additional spaces (see the YAML specification for this):

...

Code Block
pipeline:
  - datetime:
      format: "dd-MM-YYYY"
  - log

Multi-line parameters

Parameters can also span multiple lines:

...
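For example, using the YAML block scalar syntax (a sketch with the log command):

Code Block
languageyaml
pipeline:
  - log:
      message: |
        This message spans
        multiple lines.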

There are many more options for formatting line breaks in YAML. For full details, see the YAML specification: https://yaml.org/spec/1.2.2/

JSON Parameters

Furthermore, it is possible to specify JSON as a parameter value, as this example shows:

...

As you can see, there is no need to escape or convert the JSON to a string. It can be placed as JSON 1:1 inside the YAML.
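A sketch with hypothetical field names:

Code Block
languageyaml
pipeline:
  - log:
      message: {
          "name": "Some Name",
          "tags": ["a", "b"]
        }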

PEL Parameters

Besides static values, it is also possible to pass dynamic values to the parameters. This is done by using a Pipeline Expression. For example:

...
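A sketch, assuming the #{...} expression delimiters described in the PEL section below, where the output body of datetime is referenced inside a parameter value:

Code Block
languageyaml
pipeline:
  - datetime
  - log:
      message: "Current date is: #{body}"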

For more information about Pipeline Expressions, see section: Pipeline Expression Language (PEL)

Executing a Pipeline

Execute by HTTP request

Executing a Pipeline with HTTP is simple:

...
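As a sketch in curl, assuming a pipeline endpoint /api/v3/pipeline which accepts the YAML script in the request body (host name hypothetical; authentication omitted):

Code Block
curl -X POST "https://hub-mycompany.pipeforce.net/api/v3/pipeline" \
  -H "Content-Type: application/yaml" \
  --data-binary @test.pi.yaml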

See the HTTP Execution Reference for a summary of all supported HTTP options.

Sending the Body Message with HTTP

You can also send the pipeline body in an HTTP request. See this example:

...

Code Block
BODY: Hello World!

Execute in Portal

The portal offers an advanced online editor with syntax highlighting, code completion and debugging support, where you can write pipelines and execute them online. This is the easiest and most preferred way to execute and test your pipelines ad hoc. Here you can see a simple pipeline after its ad-hoc execution in the online editor:

...

Execute in CLI

Another approach to executing a pipeline is by using the Command Line Interface (CLI).

Execute local pipeline file

Let's assume you have a local pipeline YAML stored at src/global/app/myapp/pipeline/test.pi.yaml inside your PIPEFORCE workspace. Then you can execute it via this CLI call (the path must start with src/):

...

NOTE

A pipeline YAML file must end with the suffix .pi.yaml to be detected correctly by your workspace.

Execute persisted remote pipeline

In case you have stored your pipeline on the server side in the Property Store, you can execute it using this call (the path must start with global/):

...

This command searches for a property in the property store with path global/app/myapp/pipeline/test and executes it. Finally, it sends any result back to your terminal.

Pipeline Sections

Every pipeline YAML script may consist of four main sections:

...

All sections except pipeline are optional in a pipeline script. Even if not explicitly defined in the pipeline script, each section exists implicitly. That means you can read and set values on it without declaring it in the pipeline, for example by using a pipeline expression (PEL).
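A skeleton showing all four sections together (the header name, variable and body value are hypothetical):

Code Block
languageyaml
headers:
  description: "An example pipeline"
vars:
  counter: 0
pipeline:
  - log:
      message: "HELLO"
body: "Initial body content"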

headers

The headers section is optional. A header is a name-value pair to define "global configuration" hints for the given pipeline. Only text is allowed as content, i.e. no complex objects like JSON. It is not meant to be changed during pipeline processing, even though this is possible for rare cases.

...

You can read and set values in the headers section using the Pipeline Expression Language (PEL).

vars

The vars section is optional and contains transient variables as name-value pairs. It is meant as a transient scope for state during pipeline processing.

...

You can access values in the vars scope using the Pipeline Expression Language (PEL).

pipeline

The pipeline section is mandatory and lists all commands which must be executed, in the given order.

...

You can set dynamic parameter values on commands using the Pipeline Expression Language (PEL).

body

The body section is optional. It defines a single object to be used as a "data pool" or transformation data during pipeline processing.

...

You can access values in the body scope using the Pipeline Expression Language (PEL).

Pipeline as JSON

Sometimes it is necessary to use JSON as the pipeline definition language instead of YAML. Let's assume a pipeline written in YAML like this:

...
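Since the mapping between YAML and JSON is one-to-one, a simple pipeline with a single log command would translate to JSON like this (a sketch):

Code Block
languagejson
{
  "pipeline": [
    {
      "log": {
        "message": "HELLO"
      }
    }
  ]
}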

NOTE

Since YAML is the default definition language for pipelines and it is much easier to read, you should prefer it over JSON whenever possible. The JSON variant is mainly meant for internal cases where YAML is not possible or hard to use; it is not suitable for every use case.

Pipeline as URI

Besides YAML and JSON, a third option to define a pipeline is possible: using a pipeline URI, which is an inline version of a pipeline. This is handy in case you must define a pipeline as a "one-liner".

...

Info

NOTE

There is also an older approach to define a pipeline in one line, which used the pipe | symbol to separate the commands. Since this approach is not compatible with the URI syntax specification, it was dropped in favour of the approach defined in this chapter.

Auto-completion support

In order to enable auto-completion support for your pipeline YAML scripts in your local development editor, you need an editor which supports YAML schema validation. Then you get auto-completion which shows all available commands and their parameters:

...

Auto-completion in IntelliJ

To enable auto-completion in IntelliJ, open preferences and navigate to JSON Schema Mappings:

...

A YAML pipeline script should always end with the suffix .pi.yaml, which stands for pipeline scripts written in YAML.

Auto-completion in Visual Studio Code

If you also want to enable code completion for your pipeline YAML files in your VS Code editor, you need to install the YAML language support plugin from Red Hat first: https://marketplace.visualstudio.com/items?itemName=redhat.vscode-yaml

...

A local YAML pipeline script should always end with the suffix .pi.yaml, which stands for pipeline scripts written in YAML.

Auto-completion in the Portal

The built-in online workbench and the playground in the PIPEFORCE portal support pipeline script completion out of the box.

...