What is JSON Data Mapping?
Data mapping in PIPEFORCE means you have two JSON documents, the source and the target, and you want to write fields from the source JSON to the target JSON. This is called mapping since you map source fields to target fields. For each field mapping you define a rule that describes how to map. A rule can optionally contain additional instructions that prepare the data before it is written to its target location, for example to validate it or to convert text to upper case.
Mapping with command data.mapping
To simplify the mapping, you can use the command data.mapping.
It takes a list of mapping rules and applies them to the source JSON (= input) in order to write data to the target JSON (= output). Additional expressions can be added to each rule. These expressions are Pipeline Expressions and Pipeline Expression Utils.
Let's see an example first:
body: {
    "firstName": "Max ",
    "lastName": "smith",
    "age": 48,
    "birthDate": "01/12/1977",
    "hobbies": ["hiking", "biking"],
    "type": "customer"
  }

pipeline:
  - data.mapping:
      rules: |
        body.firstName -> person.firstName,
        body.lastName -> person.surname,
        body.age -> person.age,
        body.birthDate -> person.dateOfBirth,
        body.hobbies -> person.hobbies,
        body.type -> person.type
This example sets a JSON document in the initial body at the top. Note: This JSON can be loaded from any location instead.
It then applies the given mapping rules from left to right and, by default, writes the result as a new JSON document to the body (replacing the initial JSON).
As you can see, every mapping rule is placed in a separate line, each ending with a comma, except for the last one.
The left part of the mapping rule (left side of the arrow) is the input path (where to read the data from). The right part of the mapping rule (right side of the arrow) is the output path (where to write the data to):
inputPath -> outputPath
Every inputPath is relative to the context given by the input parameter. By default, this value is the current pipeline message, which contains vars, headers and body as root items. So in your mapping, you can access all of them. For example, to map from a pipeline variable, you could write a mapping rule like this:
vars.myVariable -> target
The output parameter points to the location where the mapping results should be written. By default, this is the body of the pipeline.
The final mapping result in the body from the example above will look like this:
{
  "person": {
    "firstName": "Max ",
    "surname": "smith",
    "age": 48,
    "dateOfBirth": "01/12/1977",
    "hobbies": ["hiking", "biking"],
    "type": "customer"
  }
}
As you can see, the applied mapping rules resulted in these changes:
- The input field firstName was nested inside the new element person. The field name firstName was not changed.
- The input field lastName was also mapped to the nested element person. Additionally, it was renamed from lastName to surname.
- The field age was nested inside person without any change.
- The input field birthDate was nested inside person and renamed to dateOfBirth.
- The field hobbies was nested inside person without any change.
- The field type was only nested inside person.
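To make the rule semantics more concrete, here is a minimal Python sketch that emulates what a list of inputPath -> outputPath rules does. It is not how PIPEFORCE implements data.mapping. The helper names get_path, set_path and apply_rules are hypothetical, and the sketch only supports plain dotted paths (no expressions).

```python
from typing import Any


def get_path(data: dict, path: str) -> Any:
    """Read a value via a dotted path such as 'body.firstName'."""
    current: Any = data
    for key in path.split("."):
        current = current[key]
    return current


def set_path(data: dict, path: str, value: Any) -> None:
    """Write a value via a dotted path, creating nested objects as needed."""
    *parents, last = path.split(".")
    current = data
    for key in parents:
        current = current.setdefault(key, {})
    current[last] = value


def apply_rules(message: dict, rules: str) -> dict:
    """Apply comma-separated 'inputPath -> outputPath' rules to a message."""
    target: dict = {}
    for rule in rules.split(","):
        if not rule.strip():
            continue
        input_path, output_path = (part.strip() for part in rule.split("->"))
        set_path(target, output_path, get_path(message, input_path))
    return target


# The message exposes 'body' as a root item, as in the pipeline example.
message = {"body": {"firstName": "Max ", "lastName": "smith", "age": 48}}
rules = """
    body.firstName -> person.firstName,
    body.lastName -> person.surname,
    body.age -> person.age
"""
result = apply_rules(message, rules)
print(result)
```

The sketch reads each value from the left-hand path and writes it to the right-hand path, creating the nested person object on first use.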
Using advanced Pipeline Expression in mapping rules
Now let's assume we would like to apply some functions to the values as part of the mapping. You can do so by applying Pipeline Expressions on the input path. For example:
body: {
    "firstName": "Max ",
    "lastName": "smith",
    "age": 48,
    "birthDate": "01/12/1977",
    "hobbies": ["hiking", "biking"],
    "type": "customer"
  }

pipeline:
  - data.mapping:
      rules: |
        @text.trim(body.firstName) -> person.firstName,
        @text.firstCharUpper(body.lastName) -> person.surname,
        body.age -> person.age,
        body.birthDate -> person.dateOfBirth,
        body.hobbies[0] -> person.primaryHobby,
        @text.upperCase(body.type) -> person.type,
        body.age > 18 -> person.adult,
        "male" -> person.gender,
        @data.emptyList() -> person.myList,
        @data.emptyObject() -> person.myObject
As you can see, there is no need to wrap each expression inside ${ and } since each rule part in the data mapping is automatically treated as a Pipeline Expression (PE). This makes the mapping rules much easier to read.
The result JSON of this pipeline after execution will look like this:
{
  "person": {
    "firstName": "Max",
    "surname": "Smith",
    "age": 48,
    "dateOfBirth": "01/12/1977",
    "primaryHobby": "hiking",
    "type": "CUSTOMER",
    "adult": true,
    "gender": "male",
    "myList": [],
    "myObject": {}
  }
}
As you can see, the nested mapping below person was kept. Additionally:
- The field firstName was trimmed of surrounding whitespace.
- The field surname now starts with an upper case character.
- The first item of the array hobbies was selected and set on the new element person.primaryHobby.
- The field type was converted to upper case.
- A new field person.adult was added with the result of the expression age > 18.
- The constant string male was set on the new field person.gender.
- A new, empty list was added in the new field person.myList.
- A new, empty object was added in the new field person.myObject.
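The effect of evaluating an expression on the left side before writing the value can be sketched in Python as follows. This is only an illustration, not the PIPEFORCE implementation: each rule is modeled as a hypothetical pair of a Python function and an output path, and the lambdas stand in for the utils used above under the assumption that @text.trim strips surrounding whitespace and @text.firstCharUpper upper-cases only the first character.

```python
from typing import Any, Callable


def set_path(data: dict, path: str, value: Any) -> None:
    """Write a value via a dotted path, creating nested objects as needed."""
    *parents, last = path.split(".")
    current = data
    for key in parents:
        current = current.setdefault(key, {})
    current[last] = value


# A rule pairs a function that computes the value with an output path.
Rule = tuple[Callable[[dict], Any], str]

rules: list[Rule] = [
    (lambda body: body["firstName"].strip(), "person.firstName"),
    (lambda body: body["lastName"][:1].upper() + body["lastName"][1:], "person.surname"),
    (lambda body: body["hobbies"][0], "person.primaryHobby"),
    (lambda body: body["type"].upper(), "person.type"),
    (lambda body: body["age"] > 18, "person.adult"),
    (lambda body: "male", "person.gender"),  # constant value
]

body = {
    "firstName": "Max ",
    "lastName": "smith",
    "age": 48,
    "hobbies": ["hiking", "biking"],
    "type": "customer",
}

result: dict = {}
for compute, output_path in rules:
    # Evaluate the left side first, then write the value to the right side.
    set_path(result, output_path, compute(body))

print(result)
```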
By default, the mapping result is written to the body. If you would like to write it to a variable instead, you can use the output parameter:
vars:
  mappingResult: null

pipeline:
  - data.mapping:
      rules: |
        ...
      output: ${vars.mappingResult}
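Make sure that the output target already exists before the mapping runs. In the example above, the variable mappingResult is declared in vars for this reason.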
Contextualize the mapping data
By default, the input data is provided directly to each mapping rule, so it can be accessed very easily, as this example shows:
vars:
  data: {"name": "someAttr", "value": "someValue"}

pipeline:
  - data.mapping:
      input: ${vars.data}
      rules: |
        name -> result.name,
        value -> result.value
As you can see, in each mapping rule you can directly access the attributes of the input JSON via the name and value variables.
In some situations you need more information in a mapping rule, such as the result of an inputPath, the current iteration index in case of an iteration, and more. In this case, you can set the parameter contextualize to true.
If contextualize is enabled, a mapping context is provided to each mapping rule, which gives you read access to these additional attributes:
- vars = The variables of the pipeline.
- headers = The headers of the pipeline.
- body = The current body of the pipeline.
- item = The current iteration item if this is an iteration mapping. Otherwise, the current input data.
- index = The current (0-based) iteration index if this is an iteration mapping. Otherwise, -1.
- selection = The current selection result after the inputPath expression has been evaluated (= the value from the input to be written to the output). Note: This is only available in outputPath (right side).
Note: If contextualize is enabled, the current iteration item will be replaced by the context object. In order to still access the current iteration item, you now have to use the item. prefix instead. See the example below.
vars:
  data: {"name": "someAttr", "value": "someValue"}

pipeline:
  - data.mapping:
      input: ${vars.data}
      contextualize: true
      rules: |
        item.name -> result.name,
        item.value -> result.value
Dynamic Output Path
The outputPath (right side of the mapping rule) is usually a fixed path that describes where to write the selection calculated by the inputPath.
Sometimes it is required to create this outputPath dynamically, based on the input data.
To do so, you have to set contextualize: true. This provides a mapping context to each mapping rule, which contains the current selection. It can be used to create a dynamic output path instead. See this example:
vars:
  data: {"name": "someAttr", "value": "someValue"}

pipeline:
  - data.mapping:
      contextualize: true
      input: ${vars.data}
      rules: |
        item.value -> ${item.name}
As you can see:
- Any input data is now accessed using the prefix item. This is because, if contextualize is set to true, the mapping context is provided, which adds multiple tooling variables in order to simplify mapping. Each variable has its own "namespace" so that it does not collide with the input data.
- The outputPath now uses a Pipeline Expression (PEL) in order to access the mapping context and dynamically set the name of the output attribute.
The final JSON result is this:
{ "someAttr": "someValue" }
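As a rough Python illustration of the dynamic output path, the sketch below builds a context object with item and index and resolves the ${...} placeholder in the output path against it. The function resolve_output_path and the way the context is built are hypothetical and only emulate the documented behavior for this one example.

```python
import re
from typing import Any


def get_path(data: dict, path: str) -> Any:
    """Read a value via a dotted path such as 'item.name'."""
    current: Any = data
    for key in path.split("."):
        current = current[key]
    return current


def resolve_output_path(template: str, context: dict) -> str:
    """Replace each ${path} placeholder with the value found in the context."""
    return re.sub(
        r"\$\{([^}]+)\}",
        lambda match: str(get_path(context, match.group(1).strip())),
        template,
    )


data = {"name": "someAttr", "value": "someValue"}

# With contextualize enabled, the input is reachable via the 'item' prefix.
# The index is -1 because this is not an iteration mapping.
context = {"item": data, "index": -1}

# Emulates the rule: item.value -> ${item.name}
output_path = resolve_output_path("${item.name}", context)
result = {output_path: get_path(context, "item.value")}
print(result)  # {'someAttr': 'someValue'}
```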
Conditional mapping rules using if … then …
Let's assume you would like to apply a mapping rule only if a certain condition is true, for example only if a field in the source data exists or contains a specific value. For this you can define if .. then .. statements as a prefix for each rule. For example:
body: {
    "firstName": "Max ",
    "lastName": "smith",
    "age": 48,
    "birthDate": "01/12/1977",
    "hobbies": ["hiking", "biking"],
    "type": "customer"
  }

pipeline:
  - data.mapping:
      input: ${body}
      rules: |
        if age > 18 then age -> person.adult,
        ...
As you can see in this example, the condition expression age > 18 is evaluated first (age is relative to the input, which is ${body} here). Only if it returns true, the field person.adult is created and the value of the field age is written to it. Otherwise, this mapping rule is ignored, so no person.adult field is created on the target.
Here is another example of a conditional mapping rule which checks whether a field on the source data exists and only in this case the mapping rule will be applied:
...
rules: |
  if ['someField'] then someField -> someTargetField
So the syntax of a conditional mapping rule is always:
if CONDITION then MAPPING_RULE
CONDITION is a Pipeline Expression. Only if this expression returns:
- Not null
- true
- A number >= 0
then MAPPING_RULE will be executed. Otherwise, it will be ignored and no field will be created on the target object.
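One possible reading of these condition semantics, written as a small Python sketch: the function condition_passes is hypothetical, and it assumes that any other non-null value (for example a string or an object) also counts as a pass. The actual evaluation in PIPEFORCE may differ in edge cases.

```python
from typing import Any


def condition_passes(value: Any) -> bool:
    """Decide whether a conditional mapping rule should be applied."""
    if value is None:
        return False              # null: rule is ignored
    if isinstance(value, bool):
        return value              # true passes, false does not
    if isinstance(value, (int, float)):
        return value >= 0         # numbers pass if they are >= 0
    return True                   # assumption: any other non-null value passes


body = {"age": 48}
target: dict = {}

# Emulates: if age > 18 then age -> person.adult
if condition_passes(body["age"] > 18):
    target.setdefault("person", {})["adult"] = body["age"]

# Emulates: if ['someField'] then someField -> someTargetField
# The field is missing, so the lookup yields None and the rule is skipped.
if condition_passes(body.get("someField")):
    target["someTargetField"] = body["someField"]

print(target)  # {'person': {'adult': 48}}
```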
Mapping a List (iterating JSON array)
Let's assume you have a JSON array of dynamic length as input and you would like to prepare each entry of this array and write it to a target list. This can also be done with the data.mapping command by defining the path to the array in the input parameter and setting the parameter iterate to true.
Here is an example:
body: {"people": [
    {
      "firstName": "Max ",
      "lastName": "Smith",
      "age": 48,
      "birthDate": "01/12/1977",
      "hobbies": ["hiking", "biking"],
      "type": "customer"
    },
    {
      "firstName": "Diana ",
      "lastName": "Serkson",
      "age": 56,
      "birthDate": "01/12/1967",
      "hobbies": ["reading", "swimming"],
      "type": "employee"
    }
  ]}

pipeline:
  - data.mapping:
      input: ${body.people}
      iterate: true
      rules: |
        @text.upperCase(firstName) -> person.firstName,
        lastName -> person.surname,
        age -> person.age,
        birthDate -> person.dateOfBirth,
        hobbies -> person.hobbies,
        type -> person.type
As you can see, the input parameter points to the people array inside the JSON.
The parameter iterate is set to true in order to iterate over the given JSON array.
This initially creates a new array on the output. On each iteration step, a new item is added to this list, and target fields that do not exist yet are created automatically (also recursively).
If an output location is given, the final list is written to this location. Otherwise, the list is written to the message body by default.
The output in the body from the example above will be this:
[
  {
    "person": {
      "firstName": "MAX ",
      "surname": "Smith",
      "age": 48,
      "dateOfBirth": "01/12/1977",
      "hobbies": ["hiking", "biking"],
      "type": "customer"
    }
  },
  {
    "person": {
      "firstName": "DIANA ",
      "surname": "Serkson",
      "age": 56,
      "dateOfBirth": "01/12/1967",
      "hobbies": ["reading", "swimming"],
      "type": "employee"
    }
  }
]
Hint: In some cases, an iteration item should be ignored and not copied to the target list if it is an empty JSON. In such a situation, you can set the parameter ignoreEmptyItems to true.
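The iteration behavior, including the effect of skipping empty items, can be sketched in Python like this. The functions map_list and map_person are hypothetical and only emulate the idea. In particular, the sketch assumes that "empty" refers to the mapped result of an item, which the text above does not state explicitly.

```python
from typing import Callable


def map_list(
    items: list[dict],
    map_item: Callable[[dict], dict],
    ignore_empty_items: bool = False,
) -> list[dict]:
    """Map every item of a list and collect the results in a new list."""
    results: list[dict] = []
    for item in items:
        mapped = map_item(item)
        if ignore_empty_items and not mapped:
            continue  # skip items whose mapping result is an empty object
        results.append(mapped)
    return results


def map_person(item: dict) -> dict:
    """Map only the fields that exist on the input item."""
    person: dict = {}
    if "firstName" in item:
        person["firstName"] = item["firstName"].upper()
    if "lastName" in item:
        person["surname"] = item["lastName"]
    return {"person": person} if person else {}


people = [
    {"firstName": "Max ", "lastName": "Smith"},
    {},
    {"firstName": "Diana ", "lastName": "Serkson"},
]

with_empty = map_list(people, map_person)
without_empty = map_list(people, map_person, ignore_empty_items=True)
print(len(with_empty), len(without_empty))  # 3 2
```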
Iteration Context
By default, on each iteration loop the current iteration item from the list is provided, and you can use it to apply mapping rules.
In case you need more information on each iteration loop and access to the pipeline scope, you can enable the iteration context using contextualize: true:
pipeline:
  - data.mapping:
      input: ${path.to.list}
      iterate: true
      contextualize: true
      rules: |
        ...
See here for a complete example:
vars:
  currentDate: ${@date.now()}

body: {"people": [
    {
      "firstName": "Max ",
      "lastName": "Smith",
      "age": 48,
      "birthDate": "01/12/1977",
      "hobbies": ["hiking", "biking"],
      "type": "customer"
    },
    {
      "firstName": "Diana ",
      "lastName": "Serkson",
      "age": 56,
      "birthDate": "01/12/1967",
      "hobbies": ["reading", "swimming"],
      "type": "employee"
    }
  ]}

pipeline:
  - data.mapping:
      input: ${body.people}
      iterate: true
      contextualize: true
      rules: |
        @text.upperCase(item.firstName) -> person.firstName,
        item.lastName -> person.surname,
        item.age -> person.age,
        item.birthDate -> person.dateOfBirth,
        item.hobbies -> person.hobbies,
        item.type -> person.type,
        index -> person.index,
        vars.currentDate -> person.mappingDate
For more details about the contextualize feature, see here: https://logabit.atlassian.net/wiki/spaces/PA/pages/2548039682/#Contextualize-the-mapping-data .
Mapping via FTL Template
An alternative way of mapping data from a source JSON to a target JSON is to use a template engine such as FreeMarker, which is available via the command transform.ftl. See also: Template Transformation.
Here is an example:
body: {
    "firstName": "Max ",
    "lastName": "smith",
    "age": 48,
    "birthDate": "01/12/1977",
    "hobbies": ["hiking", "biking"],
    "type": "customer"
  }

pipeline:
  - transform.ftl:
      template: |
        {
          "person": {
            "firstName": "${body.firstName}",
            "surname": "${body.lastName}",
            "age": ${body.age},
            "dateOfBirth": "${body.birthDate}",
            "hobbies": [
              <#list body.hobbies as hobby>
                "${hobby}"<#sep>, </#sep>
              </#list>
            ],
            "type": "${body.type}"
          }
        }
Which gives you this output:
{
  "person": {
    "firstName": "Max ",
    "surname": "smith",
    "age": 48,
    "dateOfBirth": "01/12/1977",
    "hobbies": ["hiking", "biking"],
    "type": "customer"
  }
}
You also can map JSON arrays using this approach. For example:
body: [
    {
      "firstName": "Max ",
      "lastName": "smith",
      "age": 48,
      "birthDate": "01/12/1977",
      "hobbies": ["hiking", "biking"],
      "type": "customer"
    },
    {
      "firstName": "Sam ",
      "lastName": "Walsh",
      "age": 48,
      "birthDate": "01/12/1977",
      "hobbies": ["swimming", "bowling"],
      "type": "customer"
    }
  ]

pipeline:
  - transform.ftl:
      iterate: true
      # Use FreeMarker language here for mappings (PEL will not work here!)
      template: |
        [
          <#list body as item>
            {
              "person": {
                "firstName": "${item.firstName}",
                "surname": "${item.lastName?cap_first}",
                "age": ${item.age},
                "dateOfBirth": "${item.birthDate}",
                "hobbies": [
                  <#list item.hobbies as hobby>
                    "${hobby}"<#sep>, </#sep>
                  </#list>
                ],
                "type": "${item.type}"
              }
            }<#sep>, </#sep>
          </#list>
        ]
Which gives you an output like this:
[
  {
    "person": {
      "firstName": "Max ",
      "surname": "Smith",
      "age": 48,
      "dateOfBirth": "01/12/1977",
      "hobbies": ["hiking", "biking"],
      "type": "customer"
    }
  },
  {
    "person": {
      "firstName": "Sam ",
      "surname": "Walsh",
      "age": 48,
      "dateOfBirth": "01/12/1977",
      "hobbies": ["swimming", "bowling"],
      "type": "customer"
    }
  }
]
Using a template engine for JSON mapping is not as clean as using the data.mapping command, but it offers great flexibility.
Keep in mind that the FreeMarker template engine also uses the ${ } syntax to access its model data. Do not confuse this with Pipeline Expressions: inside the template parameter you can only use FreeMarker expressions, not Pipeline Expressions!
For mapping functions (such as date formatting and text manipulation) you need to use the built-in functions of the FreeMarker template engine. See the ?cap_first usage in the surname attribute in the array example above.
For FreeMarker, you can find the reference for the built-in functions here: https://freemarker.apache.org/docs/ref_builtins.html