What is XML - JSON Transformation?

Converting from XML to JSON is not as straightforward as it seems, because some special cases must be handled carefully in the conversion step. For example:

  • JSON differentiates between objects, arrays of primitives, and arrays of objects; XML has no concept of arrays.

  • JSON allows an array or an object with multiple entries at the root; XML allows only a single root element.

  • JSON supports data types; XML does not (by default, every value is a string).
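
The differences listed above can be made concrete with a small sketch. It uses only the Python standard library and is not part of PIPEFORCE; the element names and the helper item_texts are made up for illustration.

```python
import json
import xml.etree.ElementTree as ET

# XML: repeated <item> elements are the only hint that a list is intended.
# A single <item> looks the same whether a list or a single value was meant.
one = ET.fromstring("<orders><item>1</item></orders>")
two = ET.fromstring("<orders><item>1</item><item>2</item></orders>")

def item_texts(orders):
    """Return the text of every <item> child. Values are always strings."""
    return [item.text for item in orders.findall("item")]

single = item_texts(one)
double = item_texts(two)

# JSON: arrays and data types are explicit in the document itself.
parsed = json.loads('{"item": [1, 2], "count": 2, "open": true}')
```

Here the XML values come back as the strings "1" and "2", while the JSON values keep their number and boolean types.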

Transform XML → JSON

Having said that, there is no "default" way of converting from XML to JSON and back, since all libraries available support the differences mentioned above only partially and/or handle them differently.

Default Conversion Scheme

Therefore, PIPEFORCE defines a system-wide "default" scheme for converting from XML to JSON and back that supports most concepts from both worlds.

This scheme is used if no scheme parameter is given to the transform.xml.json command or if it is set to null. For example:

pipeline:
  - transform.xml.json:

In the table below you can see what is supported in the default conversion scheme:

Feature                        XML to JSON   JSON to XML
XML elements                   yes           yes
XML attributes                 yes           yes
XML mixed content              yes           yes
XML processing instructions    yes           yes
XML namespaces in elements     yes           yes
XML namespaces in attributes   yes           yes
XML CDATA                      no            no
XML comments                   no            no
JSON objects                   n/a           yes
JSON array of objects          n/a           yes
JSON array of primitives       n/a           yes
JSON data types                n/a           no

This table shows, for each feature, whether its information is preserved when transforming between the two formats using the PIPEFORCE default scheme.

XML elements

An XML element like this:

<root/>

will by default be converted to a JSON structure like this:

{
  "root": {
    "attributes":[],
    "children":[]
  }
}

Even if there are no attributes and no children, those entries must exist in the JSON document with an empty array declaration. The null value is not allowed here.

This simplifies later processing and makes it possible to detect automatically whether a document is in the default PIPEFORCE format.

Furthermore, only one JSON object is allowed at the top level, similar to the single root element in XML.
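
The detection rule described above could be sketched roughly as follows. This is only an illustration in Python, not the check PIPEFORCE actually performs. The helper name looks_like_default_format is hypothetical, and two assumptions are made: attributes may be either an array or an object (the examples on this page show both), and an optional top-level processing-instructions entry is ignored.

```python
def looks_like_default_format(doc):
    """Heuristic check for the default format (hypothetical helper)."""
    if not isinstance(doc, dict):
        return False
    # Assumption: the optional "processing-instructions" entry does not
    # count as a root element.
    roots = [key for key in doc if key != "processing-instructions"]
    if len(roots) != 1:
        return False
    node = doc[roots[0]]
    if not isinstance(node, dict):
        return False
    # Both entries must exist and must not be null.
    has_attributes = isinstance(node.get("attributes"), (list, dict))
    has_children = isinstance(node.get("children"), list)
    return has_attributes and has_children
```

A document such as {"person": {"firstName": "Max"}} fails this check, because the attributes and children entries are missing.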

If there are nested XML elements, they will be placed inside the children section. For example, this:

<person>
  <firstName/>
</person>

will be converted to this JSON with nested elements:

{
  "person": {
    "attributes":[],
    "children":[
      {
        "firstName": {
          "attributes":[],
          "children":[]
        }
      }
    ]
  }
}

XML attributes

An XML element with an attribute like this:

<person age="23"/>

will be converted to a JSON structure like this:

{
  "person": {
    "attributes":[ {"age": "23"} ],
    "children":[]
  }
}

Text content

An XML element with text content like this:

<person>
  <firstName>Max</firstName>
</person>

will be converted to this JSON structure:

{
  "person": {
    "attributes":[],
    "children":[
      {
        "firstName": {
          "attributes":[],
          "children":["Max"]
        }
      }
    ]
  }
}

As you can see, the children array can contain both: text content and elements.

Mixed content

In XML it is possible to mix XML elements with text content which could look like this:

<text>This is a <b>bold</b> formatted word.</text>

This will be converted to JSON like this:

{
  "text": {
    "attributes": {},
    "children": [
      "This is a ",
      {
        "b": {
          "attributes": {},
          "children": ["bold"]
        }
      },
      " formatted word."
    ]
  }
}
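
To illustrate how text and elements end up in the same children array, here is a minimal Python sketch that builds this structure with the standard library. It is not the PIPEFORCE implementation. The function name element_to_default is hypothetical, attributes are written as an object as in the example above, and whitespace-only text (such as indentation) is assumed to be dropped.

```python
import xml.etree.ElementTree as ET

def element_to_default(elem):
    """Convert an ElementTree element into the attributes/children layout."""
    children = []
    # Text before the first child element.
    if elem.text and elem.text.strip():
        children.append(elem.text)
    for child in elem:
        children.append(element_to_default(child))
        # Text that follows a child element belongs to the parent.
        if child.tail and child.tail.strip():
            children.append(child.tail)
    return {elem.tag: {"attributes": dict(elem.attrib), "children": children}}

mixed = element_to_default(
    ET.fromstring("<text>This is a <b>bold</b> formatted word.</text>")
)
```

The order of the entries in children follows the document order of the XML, which is what makes mixed content recoverable.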

Namespaces

XML has the concept of namespaces, which makes it possible to extend XML structures with other, custom structures.

An XML document with a custom namespace could look like this:

<foo:person xmlns:foo="http://some.ns">
  <foo:firstName foo:age="23"/>
</foo:person>

As you can see, all elements and attributes are bound to the namespace here using the prefix foo.

If you convert this with the default XML to JSON transformation rules, you will get a JSON like this:

{
  "foo:person": {
    "attributes": {
      "xmlns:foo": "http://some.ns"
    },
    "children": [
      {
        "foo:firstName": {
          "attributes": {
            "foo:age": "23"
          },
          "children": []
        }
      }
    ]
  }
}
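
The key point of this mapping is that prefixes and xmlns declarations are kept as plain names. The following Python sketch reproduces that behavior with xml.dom.minidom from the standard library, which exposes prefixed names and namespace declarations as they were written. It is an illustration only; the function name element_to_default is hypothetical and text content is left out to keep it short.

```python
from xml.dom import minidom

def element_to_default(node):
    """Convert a minidom element, keeping prefixes such as "foo:" as-is."""
    # minidom lists xmlns declarations together with the ordinary attributes.
    attributes = dict(node.attributes.items())
    children = [
        element_to_default(child)
        for child in node.childNodes
        if child.nodeType == child.ELEMENT_NODE
    ]
    return {node.tagName: {"attributes": attributes, "children": children}}

xml_text = (
    '<foo:person xmlns:foo="http://some.ns">'
    '<foo:firstName foo:age="23"/>'
    '</foo:person>'
)
converted = element_to_default(minidom.parseString(xml_text).documentElement)
```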

Processing instructions

XML documents can contain processing instructions in the prologue like this:

<?someInstruction someParams?>
<root/>

This will be converted to a JSON like this:

{
  "processing-instructions": {
    "someInstruction": "someParams"
  },
  "root": {
    "attributes": {},
    "children": []
  }
}
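
As a rough illustration of where the processing-instructions entry comes from, the Python sketch below collects the instructions from the prologue with xml.dom.minidom. The function name processing_instructions is hypothetical and this is not the PIPEFORCE implementation.

```python
from xml.dom import minidom

def processing_instructions(xml_text):
    """Map each prologue instruction target to its parameter string."""
    doc = minidom.parseString(xml_text)
    return {
        node.target: node.data
        for node in doc.childNodes
        if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
    }

instructions = processing_instructions("<?someInstruction someParams?><root/>")
```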

Jackson Conversion Scheme

SINCE VERSION 9.0

As an alternative to the Default Conversion scheme, you can use the conversion scheme type jackson:

pipeline:
  - transform.xml.json:
      scheme: jackson

This will use the Jackson XML Mapping scheme to convert a given XML to JSON.

The jackson conversion scheme is described by the cases below. Each case shows the XML input, then the resulting JSON, then examples of how to access the values with PEL.

Single root element:

<root/>

Invalid (single root element is not allowed by default by Jackson)

Root element with text:

<root>Some text</root>

Invalid (single root element with text is not allowed by default by Jackson)

Nested element:

<root>
  <orders/>
</root>

{
  "orders": ""
}

orders   
-> "" 

Nested element with text:

<root>
 <orders>Some text</orders>
</root>

{
  "orders": "Some text"
}

orders   
-> "Some text" 

Deeper nested element with text:

<root>
 <orders>
  <item>My Item</item>
 </orders>
</root>

{
  "orders": {
    "item": "My Item"
  }
}

orders.item
-> "My Item"   

Multiple deep elements of same name:

<root>
 <orders>
  <item>My Item1</item>
  <item>My Item2</item>
 </orders>
</root>

{
  "orders": {
    "item": [
      "My Item1",
      "My Item2"
    ]
  }
}

orders.item[0]
-> "My Item1"
orders['item'][1]
-> "My Item2"
orders.item
-> [
    "My Item1", 
    "My Item2"
   ]

Multiple elements of different name:

<root>
 <orders>
  <itemA>Item1</itemA>
  <itemB>Item2</itemB>
 </orders>
</root>

{
  "orders": {
    "itemA": "Item1",
    "itemB": "Item2"
  }
}

orders.itemA
-> "Item1"
orders['itemB']
-> "Item2"
orders
-> {
"itemA": "Item1",
"itemB": "Item2"
}

XML attribute and text mixed:

<root>
 <orders>
  <item id="0">Text</item>
 </orders>
</root>

{
  "orders": {
    "item": {
      "id": "0",
      "": "Text"
    }
  }
}

orders.item.id
-> "0"
orders.item['']
-> "Text"

Mixed content (text + element):

<root>
  text
  <a>Foo</a>
</root>

{
  "": "\n text\n ",
  "a": "Foo"
}

['']
-> "\n text\n"
a
-> "Foo"
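
The cases above can be approximated with a few lines of Python. This sketch only mimics the documented outcomes (root name dropped, repeated names collected into an array, text next to attributes or elements stored under the empty key); it is not Jackson and does not model the two invalid root-only cases. The function names node_value and jackson_like are hypothetical.

```python
import xml.etree.ElementTree as ET

def node_value(elem):
    """Return a string for plain elements, otherwise an object."""
    text = elem.text or ""
    if len(elem) == 0 and not elem.attrib:
        return text  # <orders/> becomes "", <orders>x</orders> becomes "x"
    obj = dict(elem.attrib)
    if text.strip():
        obj[""] = text  # text mixed with attributes or child elements
    for child in elem:
        value = node_value(child)
        if child.tag not in obj:
            obj[child.tag] = value
        elif isinstance(obj[child.tag], list):
            obj[child.tag].append(value)
        else:
            # Second occurrence of the same name: switch to an array.
            obj[child.tag] = [obj[child.tag], value]
    return obj

def jackson_like(xml_text):
    """Convert the content of the root element; the root name is dropped."""
    return node_value(ET.fromstring(xml_text))

items = jackson_like(
    "<root><orders><item>My Item1</item><item>My Item2</item></orders></root>"
)
```

Note that a single item yields a string while two items yield an array, so consumers of this scheme need to handle both shapes.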

Transform JSON → XML

With Command transform.json.xml

If the JSON document is in the default XML-JSON transformation format (contains attributes and children elements), then it will be transformed to XML using the default transformation rules, explained above.

These rules are implemented by the transform.json.xml command.

Otherwise, if the starting point is a custom JSON document that does not comply with the default XML-JSON transformation format, the conversion works differently and the rules explained below are applied automatically for the JSON → XML conversion.

A JSON document like this:

{
  "person": {
    "firstName": "Max"
  }
}

will be converted to this XML:

<root>
  <person>
    <firstName>Max</firstName>
  </person>
</root>

A JSON document with an array in it could look like this:

{
  "person": {
    "firstName": "Max",
    "hobbies": ["Reading", "Biking", "Swimming"]
  }
}

This will be converted to an XML structure like this:

<root>
  <person>
    <firstName>Max</firstName>
    <hobbies>Reading</hobbies>
    <hobbies>Biking</hobbies>
    <hobbies>Swimming</hobbies>
  </person>
</root>
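
The two rules shown above (wrap everything in a root element, repeat the element name for each array entry) can be sketched in Python as follows. This is an illustration under those assumptions, not the transform.json.xml implementation. The function names are hypothetical, and primitive values are simply written with str().

```python
import xml.etree.ElementTree as ET

def append_value(parent, name, value):
    """Append value below parent, using name as the element name."""
    if isinstance(value, list):
        # Arrays have no XML counterpart: repeat the element for each entry.
        for item in value:
            append_value(parent, name, item)
    elif isinstance(value, dict):
        child = ET.SubElement(parent, name)
        for key, nested in value.items():
            append_value(child, key, nested)
    else:
        ET.SubElement(parent, name).text = str(value)

def json_to_xml(doc, root_name="root"):
    """Wrap the converted document in a single root element."""
    root = ET.Element(root_name)
    for key, value in doc.items():
        append_value(root, key, value)
    return ET.tostring(root, encoding="unicode")

xml_out = json_to_xml({"person": {"firstName": "Max", "hobbies": ["Reading", "Cooking"]}})
```

The output is not indented. Converting it back to JSON cannot tell reliably whether hobbies was an array if only one entry is present.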

With FTL template

Another approach to transform a given JSON to an XML is to use the transform.ftl command which converts from any JSON structure to any XML structure by using the FreeMarker template language.

This approach is very flexible since any input JSON can be converted to any concrete output XML structure.

For more information about this approach, see Template Transformation.

Here is an example:

# The source JSON document
body: {
        "rows": [
            {
                "firstName": "Max",
                "lastName": "Smith",
                "age": "38"
            },
            {
                "firstName": "Susann",
                "lastName": "Mayr Wan",
                "age": "44"
            }
        ]
    }

pipeline:
    - transform.ftl:
        # These are the conversion rules from JSON -> XML
        template: |
            <root>
            <#list body.rows as person>
                <person>
                    <firstName>${person.firstName}</firstName>
                    <lastName>${person.lastName}</lastName>
                    <age>${person.age}</age>
                </person>
            </#list>
            </root>

This will result in an XML output like this:

<root>
    <person>
        <firstName>Max</firstName>
        <lastName>Smith</lastName>
        <age>38</age>
    </person>
    <person>
        <firstName>Susann</firstName>
        <lastName>Mayr Wan</lastName>
        <age>44</age>
    </person>
</root>