What is JSON Data Mapping?

Data mapping in PIPEFORCE means you have two JSON documents, a source and a target, and you would like to write fields from the source JSON to the target JSON. This is called mapping because each source field is mapped to a target field. For each field mapping you can define a rule that describes how to map and that optionally contains additional instructions to prepare the data before it is written to its target location, for example to validate it or to convert text to upper case.

Mapping with command data.mapping

To simplify the mapping, the command data.mapping can be used.

It takes a list of mapping rules and applies them to the source JSON (= input) in order to write data to the target JSON (= output). Additional expressions can be added to each rule. These expressions are Pipeline Expressions and Pipeline Expression Utils.

Let's see an example first:

body: {
        "firstName": "Max  ",
        "lastName": "smith",
        "age": 48,
        "birthDate": "01/12/1977",
        "hobbies": ["hiking", "biking"],
        "type": "customer"
      }

pipeline:
    - data.mapping:
        rules: |
            body.firstName   -> person.firstName,
            body.lastName    -> person.surname,
            body.age         -> person.age,
            body.birthDate   -> person.dateOfBirth,
            body.hobbies     -> person.hobbies,
            body.type        -> person.type

This example sets a JSON document in the initial body at the top. Note: This JSON can be loaded from any location instead.

Then it applies the given mapping rules from left to right and, by default, writes the result as a new JSON document to the body (replacing the initial JSON).

As you can see, every mapping rule is placed in a separate line, each ending with a comma, except for the last one.

The left part of the mapping rule (left side of the arrow) is the input path (where to read the data from). The right part of the mapping rule (right side of the arrow) is the output path (where to write the data to):

inputPath -> outputPath

All paths in inputPath are relative to the context given by the input parameter. By default, this context is the current pipeline message, which contains vars, headers and body as root items, so your mapping can access all of them. For example, to map from a pipeline variable, you could write a mapping rule like this:
vars.myVariable -> target

The output parameter points to the location, where the mapping results should be written to. This is by default the body of the pipeline.

The final mapping result in the body from the example above will look like this:

{
    "person": {
        "firstName": "Max  ",
        "surname": "smith",
        "age": 48,
        "dateOfBirth": "01/12/1977",
        "hobbies": [
            "hiking",
            "biking"
        ],
        "type": "customer"
    }
}

As you can see, the applied mapping rules resulted in these changes:

  • The input field firstName was nested inside the new element person. The field name firstName was not changed.

  • The input field lastName was also mapped to the nested element person. Additionally it was renamed from lastName to surname.

  • The field age was nested inside person without any change.

  • The input field birthDate was nested inside person and renamed to dateOfBirth.

  • The field hobbies was nested inside person without any change.

  • The field type was nested inside person without any change.
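The nesting and renaming described above can be modeled in a few lines of Python. This is only a sketch of the idea, not how PIPEFORCE implements data.mapping: the helper names get_path, set_path and apply_rules are hypothetical, and each rule is written as an (inputPath, outputPath) pair instead of the arrow syntax.

```python
from typing import Any


def get_path(data: dict, path: str) -> Any:
    """Read a value from nested dicts using a dotted path such as 'body.age'."""
    current: Any = data
    for key in path.split("."):
        current = current[key]
    return current


def set_path(target: dict, path: str, value: Any) -> None:
    """Write a value to a dotted path, creating intermediate objects as needed."""
    *parents, leaf = path.split(".")
    current = target
    for key in parents:
        current = current.setdefault(key, {})
    current[leaf] = value


def apply_rules(message: dict, rules: list[tuple[str, str]]) -> dict:
    """Apply (input_path, output_path) pairs in order and return a new document."""
    result: dict = {}
    for input_path, output_path in rules:
        set_path(result, output_path, get_path(message, input_path))
    return result


# Stand-in for the pipeline message with its root items vars, headers and body.
message = {
    "vars": {},
    "headers": {},
    "body": {
        "firstName": "Max  ",
        "lastName": "smith",
        "age": 48,
        "birthDate": "01/12/1977",
        "hobbies": ["hiking", "biking"],
        "type": "customer",
    },
}

rules = [
    ("body.firstName", "person.firstName"),
    ("body.lastName", "person.surname"),
    ("body.age", "person.age"),
    ("body.birthDate", "person.dateOfBirth"),
    ("body.hobbies", "person.hobbies"),
    ("body.type", "person.type"),
]

mapped = apply_rules(message, rules)
```

Here mapped holds the same structure as the result JSON shown above: all fields end up below person, and only lastName and birthDate change their names.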

Using advanced Pipeline Expression in mapping rules

Now let's assume we would like to apply functions to the values while mapping them. You can do so by applying Pipeline Expressions on the input path. For example:

body: {
        "firstName": "Max  ",
        "lastName": "smith",
        "age": 48,
        "birthDate": "01/12/1977",
        "hobbies": ["hiking", "biking"],
        "type": "customer"
     }

pipeline:
    - data.mapping:
        rules: |
            @text.trim(body.firstName)           -> person.firstName,
            @text.firstCharUpper(body.lastName)  -> person.surname,
            body.age                             -> person.age,
            body.birthDate                       -> person.dateOfBirth,
            body.hobbies[0]                      -> person.primaryHobby,
            @text.upperCase(body.type)           -> person.type,
            body.age > 18                        -> person.adult,
            "male"                               -> person.gender,
            @data.emptyList()                    -> person.myList,
            @data.emptyObject()                  -> person.myObject

As you can see, there is no need to wrap each expression inside ${ and } since each rule part in the data mapping is automatically treated as a PE. This makes it much easier to read the mapping rules.

The result JSON of this pipeline after execution will look like this:

{
    "person": {
        "firstName": "Max",
        "surname": "Smith",
        "age": 48,
        "dateOfBirth": "01/12/1977",
        "primaryHobby": "hiking",
        "type": "CUSTOMER",
        "adult": true,
        "gender": "male",
        "myList": [],
        "myObject": {}
    }
}

As you can see, the nested mapping below person was kept. Additionally:

  • the surrounding whitespace was trimmed from the field firstName

  • the field surname now starts with an upper-case character

  • the first item of the array hobbies was selected and set to the new element person.primaryHobby

  • the field type was converted to upper case

  • a new field person.adult was added with the result of the expression body.age > 18

  • the constant string male was set to the new field person.gender

  • a new, empty list was added in new field person.myList

  • a new, empty object was added in new field person.myObject.
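To make the effect of each expression concrete, the following Python sketch reproduces the same transformations with plain string operations. It is an illustration under assumptions, not the PIPEFORCE implementation: each rule is modeled as a hypothetical (function, outputPath) pair, and the Python string methods only stand in for @text.trim, @text.firstCharUpper and @text.upperCase.

```python
from typing import Any, Callable

source = {
    "firstName": "Max  ",
    "lastName": "smith",
    "age": 48,
    "birthDate": "01/12/1977",
    "hobbies": ["hiking", "biking"],
    "type": "customer",
}

# Each rule pairs a function computing the value with the output path to write to.
rules: list[tuple[Callable[[dict], Any], str]] = [
    (lambda b: b["firstName"].strip(), "person.firstName"),             # trim
    (lambda b: b["lastName"][:1].upper() + b["lastName"][1:], "person.surname"),
    (lambda b: b["age"], "person.age"),
    (lambda b: b["birthDate"], "person.dateOfBirth"),
    (lambda b: b["hobbies"][0], "person.primaryHobby"),                 # first item
    (lambda b: b["type"].upper(), "person.type"),                       # upper case
    (lambda b: b["age"] > 18, "person.adult"),                          # boolean result
    (lambda b: "male", "person.gender"),                                # constant
    (lambda b: [], "person.myList"),                                    # empty list
    (lambda b: {}, "person.myObject"),                                  # empty object
]


def set_path(target: dict, path: str, value: Any) -> None:
    """Write a value to a dotted path, creating intermediate objects as needed."""
    *parents, leaf = path.split(".")
    current = target
    for key in parents:
        current = current.setdefault(key, {})
    current[leaf] = value


result: dict = {}
for compute, output_path in rules:
    set_path(result, output_path, compute(source))
```

After the loop, result matches the result JSON shown above.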

By default the mapping result gets written to the body. If you would like to write it to a variable instead, you can use the output parameter:

vars:
    mappingResult: null
pipeline:
    - data.mapping:
        rules: |
            ...
        output: ${vars.mappingResult}

Make sure that the output target (here the variable mappingResult) has been created before the command runs.

Contextualize the mapping data

By default the input data is provided directly to each mapping rule, so it can be accessed easily, as this example shows:

vars:
  data: {"name": "someAttr", "value": "someValue"}
  
pipeline:
  - data.mapping: 
      input: ${vars.data}
      rules: |
        name -> result.name,
        value -> result.value

As you can see, in each mapping rule you can directly access the attributes of the input JSON via the name and value variables.

In some situations you need more information in a mapping rule, such as the result of an inputPath or the current iteration index in case of an iteration. In this case, you can set the parameter contextualize to true.

If contextualize is enabled, a mapping context is provided to each mapping rule, which gives you read access to these additional attributes:

  • vars = The variables of the pipeline.

  • headers = The headers of the pipeline.

  • body = The current body of the pipeline.

  • item = The current iteration item if this is an iteration mapping. Otherwise returns the current input data.

  • index = The current (0 based) iteration index if this is an iteration mapping. Otherwise returns -1.

  • selection = The current selection result after the inputPath expression has been evaluated (= the value from the input to be written to the output). Note: This is only available in outputPath (right side).

Note: If contextualize is enabled, the current iteration item will be replaced by the context object. To still access the current iteration item, you have to use the item. prefix instead. See the example below.

vars:
  data: {"name": "someAttr", "value": "someValue"}
  
pipeline:
  - data.mapping: 
      input: ${vars.data}
      contextualize: true
      rules: |
        item.name -> result.name,
        item.value -> result.value

Dynamic Output Path

The outputPath (right side of the mapping rule) is usually a fixed path to describe where to write the calculated selection from the inputPath rule to.

Sometimes it is required to create this outputPath dynamically based on the input data.

To do so, you have to set contextualize: true. This will provide you a mapping context to each mapping rule which contains the current selection. This can be used to create a dynamic output path instead. See this example:

vars:
  data: {"name": "someAttr", "value": "someValue"}
  
pipeline:
  - data.mapping: 
      contextualize: true
      input: ${vars.data}
      rules: |
        item.value -> ${item.name}

As you can see:

  • Any input data is now accessed using the prefix item. This is because if contextualize is set to true, the mapping context is provided which adds multiple tooling variables in order to simplify mapping. Each variable has its own “namespace” to not collide with the input data.

  • The outputPath now uses a Pipeline Expression (PEL) in order to access the mapping context and dynamically set the name of the output attribute.

The final JSON result is this:

{
  "someAttr": "someValue"
}
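The dynamic output path can be pictured with the following Python sketch. It is a simplified model, not PIPEFORCE code: the function map_with_context and the shape of the context dictionary are assumptions based on the attribute list above, and the output key is treated as a flat name rather than a nested path.

```python
from typing import Any, Callable, Optional

Context = dict[str, Any]


def map_with_context(
    input_data: Any,
    rules: list[tuple[Callable[[Context], Any], Callable[[Context], str]]],
    pipeline_vars: Optional[dict] = None,
) -> dict:
    """Evaluate each rule against a context object instead of the raw input."""
    # Outside of an iteration, item is the input data and index is -1.
    context: Context = {
        "vars": pipeline_vars or {},
        "headers": {},
        "body": None,
        "item": input_data,
        "index": -1,
    }
    result: dict = {}
    for select, output_key in rules:
        selection = select(context)
        # The selection is only available when resolving the output side.
        output_context = {**context, "selection": selection}
        result[output_key(output_context)] = selection
    return result


data = {"name": "someAttr", "value": "someValue"}

# Corresponds to the rule: item.value -> ${item.name}
rules = [(lambda ctx: ctx["item"]["value"], lambda ctx: ctx["item"]["name"])]

result = map_with_context(data, rules)
```

In this model, result is {"someAttr": "someValue"}: the output attribute name is taken from the input data rather than being fixed in the rule.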

Conditional mapping rules using if … then …

Let's assume you would like to apply a mapping rule only if a certain condition is true, for example only if a field in the source data exists or contains a specific value. For this you can define if .. then .. statements as a prefix for each rule. For example:

body: {
        "firstName": "Max  ",
        "lastName": "smith",
        "age": 48,
        "birthDate": "01/12/1977",
        "hobbies": ["hiking", "biking"],
        "type": "customer"
     }

pipeline:
    - data.mapping:
        input: ${body}
        rules: |
            if age > 18 then age -> person.adult,
            ...

As you can see in this example, the condition expression age > 18 will be executed, and only if it returns true will the field person.adult be created and set to the value of the field age. Otherwise, this mapping rule will be ignored, so no person.adult field will be created on the target.

Here is another example of a conditional mapping rule which checks whether a field on the source data exists and only in this case the mapping rule will be applied:

...
rules: |
  if ['someField'] then someField -> someTargetField

So the syntax of a conditional mapping rule is always:

if CONDITION then MAPPING_RULE

CONDITION is a Pipeline Expression. Only if this expression returns:

  • Not null

  • true

  • A number >= 0

then MAPPING_RULE will be executed. Otherwise, it will be ignored and no field on target object will be created.
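As a rough illustration of this behavior, the sketch below skips a rule when its condition is not met, so the target field is never created. It is a simplified model with hypothetical names (apply_conditional_rules, set_path), and it reduces CONDITION to a plain Python boolean; the not-null and numeric cases listed above are not modeled.

```python
from typing import Any, Callable, Optional

# None means the rule has no "if ... then" prefix and always applies.
Condition = Optional[Callable[[dict], bool]]


def set_path(target: dict, path: str, value: Any) -> None:
    """Write a value to a dotted path, creating intermediate objects as needed."""
    *parents, leaf = path.split(".")
    current = target
    for key in parents:
        current = current.setdefault(key, {})
    current[leaf] = value


def apply_conditional_rules(
    source: dict, rules: list[tuple[Condition, str, str]]
) -> dict:
    """Apply each (condition, input_key, output_path) rule whose condition holds."""
    result: dict = {}
    for condition, input_key, output_path in rules:
        if condition is not None and not condition(source):
            continue  # rule is ignored, no target field is created
        set_path(result, output_path, source[input_key])
    return result


rules = [
    # if age > 18 then age -> person.adult
    (lambda s: s["age"] > 18, "age", "person.adult"),
    # if ['someField'] then someField -> someTargetField
    (lambda s: "someField" in s, "someField", "someTargetField"),
    # unconditional rule: type -> person.type
    (None, "type", "person.type"),
]

adult = apply_conditional_rules({"age": 48, "type": "customer"}, rules)
minor = apply_conditional_rules(
    {"age": 12, "type": "customer", "someField": "x"}, rules
)
```

For the first input only person.adult and person.type are written; for the second, person.adult is missing entirely while someTargetField is present.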

Mapping a List (iterating JSON array)

Let's assume you have a JSON array of dynamic length as input and you would like to prepare each entry of this array and write it to a target list. This can also be done with the data.mapping command by defining the path to the array in the input parameter and setting the parameter iterate to true.

Here is an example:

body: {"people": [
        {
            "firstName": "Max  ",
            "lastName": "Smith",
            "age": 48,
            "birthDate": "01/12/1977",
            "hobbies": ["hiking", "biking"],
            "type": "customer"
        },
        {
            "firstName": "Diana  ",
            "lastName": "Serkson",
            "age": 56,
            "birthDate": "01/12/1967",
            "hobbies": ["reading", "swimming"],
            "type": "employee"
        }
    ]}

pipeline:
    - data.mapping:
        input: ${body.people}
        iterate: true
        rules: |
            @text.upperCase(firstName)   -> person.firstName,
            lastName    -> person.surname,
            age         -> person.age,
            birthDate   -> person.dateOfBirth,
            hobbies     -> person.hobbies,
            type        -> person.type

As you can see, the input parameter points to the people array inside the JSON.

The parameter iterate is set to true in order to iterate over the given JSON array.

This initially creates a new array on the output and then, on each iteration step, adds a new entry to this list. Target fields that do not exist yet are created automatically (also recursively).

If the output parameter is set, the final list will be written to that location. If it is not set, the list will be written to the message body by default.

The output in the body from the example above will be this:

[
    {
        "person": {
            "firstName": "MAX  ",
            "surname": "Smith",
            "age": 48,
            "dateOfBirth": "01/12/1977",
            "hobbies": ["hiking", "biking"],
            "type": "customer"
        }
    },
    {
        "person": {
            "firstName": "DIANA  ",
            "surname": "Serkson",
            "age": 56,
            "dateOfBirth": "01/12/1967",
            "hobbies": ["reading", "swimming"],
            "type": "employee"
        }
    }
]

Hint: In some cases an iteration item should be ignored and not copied to the target list if it is an empty JSON. In such a situation, you can set the parameter ignoreEmptyItems to true.
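The iteration can be thought of as the loop in the following Python sketch. It is an illustration only: map_list and map_person are hypothetical names, and the handling of ignore_empty_items assumes that "empty" refers to an empty source item, which the text above does not state explicitly.

```python
from typing import Callable


def map_list(
    items: list[dict],
    map_item: Callable[[dict], dict],
    ignore_empty_items: bool = False,
) -> list[dict]:
    """Create a new list and append one mapped entry per input item."""
    output: list[dict] = []
    for item in items:
        # Assumption: an "empty item" is an empty source object such as {}.
        if ignore_empty_items and not item:
            continue
        output.append(map_item(item))
    return output


def map_person(item: dict) -> dict:
    """Python stand-in for the mapping rules of the example above."""
    return {
        "person": {
            "firstName": item["firstName"].upper(),
            "surname": item["lastName"],
            "age": item["age"],
            "dateOfBirth": item["birthDate"],
            "hobbies": item["hobbies"],
            "type": item["type"],
        }
    }


people = [
    {"firstName": "Max  ", "lastName": "Smith", "age": 48,
     "birthDate": "01/12/1977", "hobbies": ["hiking", "biking"],
     "type": "customer"},
    {},  # empty item, skipped because ignore_empty_items is True
    {"firstName": "Diana  ", "lastName": "Serkson", "age": 56,
     "birthDate": "01/12/1967", "hobbies": ["reading", "swimming"],
     "type": "employee"},
]

mapped_people = map_list(people, map_person, ignore_empty_items=True)
```

In this sketch mapped_people contains two entries, one per non-empty input item, each wrapped in a person object as in the output above.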

Iteration Context

By default on any iteration loop, the current iteration item from the list will be provided and you can use it to apply mapping rules.

In case you need more information on each iteration loop and access to the pipeline scope, you can enable the iteration context using contextualize: true:

pipeline:
    - data.mapping:
        input: ${path.to.list}
        iterate: true
        contextualize: true
        rules: |
            ...

See here for a complete example:

vars:
  currentDate: ${@date.now()}
  
body: {"people": [
        {
            "firstName": "Max  ",
            "lastName": "Smith",
            "age": 48,
            "birthDate": "01/12/1977",
            "hobbies": ["hiking", "biking"],
            "type": "customer"
        },
        {
            "firstName": "Diana  ",
            "lastName": "Serkson",
            "age": 56,
            "birthDate": "01/12/1967",
            "hobbies": ["reading", "swimming"],
            "type": "employee"
        }
    ]}

pipeline:
    - data.mapping:
        input: ${body.people}
        iterate: true
        contextualize: true
        rules: |
            @text.upperCase(item.firstName)   -> person.firstName,
            item.lastName                     -> person.surname,
            item.age                          -> person.age,
            item.birthDate                    -> person.dateOfBirth,
            item.hobbies                      -> person.hobbies,
            item.type                         -> person.type,
            index                             -> person.index,
            vars.currentDate                  -> person.mappingDate

For more details about the contextualize feature, see here: https://logabit.atlassian.net/wiki/spaces/PA/pages/2548039682/#Contextualize-the-mapping-data .
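A minimal Python model of the iteration context might look as follows. The function name map_list_with_context and the fixed date string are assumptions made for illustration; in the pipeline above the date comes from @date.now(), and headers and body are left out of the context here for brevity.

```python
from typing import Any, Callable

Context = dict[str, Any]


def map_list_with_context(
    items: list[dict],
    pipeline_vars: dict,
    rules: list[tuple[Callable[[Context], Any], str]],
) -> list[dict]:
    """Map each item with access to item, the 0-based index and pipeline vars."""
    output: list[dict] = []
    for index, item in enumerate(items):
        context: Context = {"vars": pipeline_vars, "item": item, "index": index}
        person = {field: select(context) for select, field in rules}
        output.append({"person": person})
    return output


people = [
    {"firstName": "Max  ", "lastName": "Smith"},
    {"firstName": "Diana  ", "lastName": "Serkson"},
]

# Placeholder value standing in for the pipeline variable currentDate.
pipeline_vars = {"currentDate": "2024-01-01"}

rules = [
    (lambda ctx: ctx["item"]["firstName"].upper(), "firstName"),
    (lambda ctx: ctx["item"]["lastName"], "surname"),
    (lambda ctx: ctx["index"], "index"),
    (lambda ctx: ctx["vars"]["currentDate"], "mappingDate"),
]

result = map_list_with_context(people, pipeline_vars, rules)
```

Each entry in result carries its own index (0 for the first person, 1 for the second) and the same mappingDate taken from the pipeline variables.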

Mapping via FTL Template

An alternative way to map data from a source JSON to a target JSON is to use a template engine, such as FreeMarker, which is available via the command transform.ftl. See also here: Template Transformation .

Here is an example:

body: {
        "firstName": "Max  ",
        "lastName": "smith",
        "age": 48,
        "birthDate": "01/12/1977",
        "hobbies": ["hiking", "biking"],
        "type": "customer"
     }

pipeline:
    - transform.ftl:
        template: |
          {
            "person": {
                "firstName": "${body.firstName}",
                "surname": "${body.lastName}",
                "age": ${body.age},
                "dateOfBirth": "${body.birthDate}",
                "hobbies": [
                <#list body.hobbies as hobby>
                "${hobby}"<#sep>, </#sep>
                </#list>
                ],
                "type": "${body.type}"
            }
          }

Which gives you this output:

{
    "person": {
        "firstName": "Max  ",
        "surname": "smith",
        "age": 48,
        "dateOfBirth": "01/12/1977",
        "hobbies": [
            "hiking",
            "biking"
        ],
        "type": "customer"
    }
}

You also can map JSON arrays using this approach. For example:

body: [
    {
        "firstName": "Max  ",
        "lastName": "smith",
        "age": 48,
        "birthDate": "01/12/1977",
        "hobbies": ["hiking", "biking"],
        "type": "customer"
     },
     {
        "firstName": "Sam  ",
        "lastName": "Walsh",
        "age": 48,
        "birthDate": "01/12/1977",
        "hobbies": ["swimming", "bowling"],
        "type": "customer"
     }
     ]

pipeline:
    - transform.ftl:

        iterate: true
        # Use FreeMarker language here for mappings (PEL will not work here!)
        template: |
          [
          <#list body as item>
          {
            "person": {
                "firstName": "${item.firstName}",
                "surname": "${item.lastName?cap_first}",
                "age": ${item.age},
                "dateOfBirth": "${item.birthDate}",
                "hobbies": [
                <#list item.hobbies as hobby>
                "${hobby}"<#sep>, </#sep>
                </#list>
                ],
                "type": "${item.type}"
            }
          }<#sep>, </#sep>
          </#list>
          ]

Which gives you an output like this:

[
    {
        "person": {
            "firstName": "Max  ",
            "surname": "Smith",
            "age": 48,
            "dateOfBirth": "01/12/1977",
            "hobbies": [
                "hiking",
                "biking"
            ],
            "type": "customer"
        }
    },
    {
        "person": {
            "firstName": "Sam  ",
            "surname": "Walsh",
            "age": 48,
            "dateOfBirth": "01/12/1977",
            "hobbies": [
                "swimming",
                "bowling"
            ],
            "type": "customer"
        }
    }
]

Using a template engine for JSON mapping is not as clean as using the data.mapping command, but it offers great flexibility.

Keep in mind that the FreeMarker template engine also uses the ${ } syntax to access its model data. Do not confuse this with Pipeline Expressions: inside the template parameter you can only use FreeMarker expressions, no Pipeline Expressions!

For the mapping functions (like date format, text manipulation and so on) you need to use the built-in functions of the FreeMarker template engine. See the ?cap_first usage in the surname attribute in the example.

For FreeMarker, you can find the reference for the built-in functions here: https://freemarker.apache.org/docs/ref_builtins.html
