What is XML - JSON Transformation?
Converting XML to JSON is not as straightforward as it may seem, because some special cases must be handled carefully during conversion. For example:
XML supports mixed content; JSON does not.
XML has processing instructions; JSON does not.
XML has namespaces; JSON does not.
XML has attributes; JSON does not.
XML has CDATA sections; JSON does not.
XML allows comments; JSON does not.
On the other hand:
JSON differentiates between objects, arrays of primitives and arrays of objects; XML has no concept of "arrays".
JSON allows an array or an object with multiple entries to be the root element; XML allows only a single element to be the root.
JSON supports data types; XML does not (by default, everything is a string).
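The last two differences can be seen with nothing more than the Python standard library. The snippet below is only an illustration with made-up sample data (the `person` documents are hypothetical); it does not involve PIPEFORCE at all.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical sample data, equivalent in intent but not in expressiveness.
person_json = json.loads('{"age": 23, "active": true, "hobbies": ["Reading"]}')
person_xml = ET.fromstring(
    '<person age="23" active="true"><hobbies>Reading</hobbies></person>'
)

# JSON keeps the data types of its values.
json_types = {key: type(value).__name__ for key, value in person_json.items()}

# XML attribute values are always strings.
xml_types = {key: type(value).__name__ for key, value in person_xml.attrib.items()}

# XML cannot express "an array with one entry": there is just one <hobbies> element.
hobby_elements = person_xml.findall("hobbies")

print(json_types)
print(xml_types)
print(len(hobby_elements))
```

A converter therefore has to decide on its own whether `age` becomes a number and whether a single `<hobbies>` element becomes an array, which is one reason different libraries produce different JSON.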
Transform XML → JSON
As a result, there is no "default" way of converting from XML to JSON and back: the available libraries support the differences mentioned above only partially, and each handles them differently.
Default Conversion Scheme
PIPEFORCE therefore defines a system-wide "default" scheme for converting from XML to JSON and back that supports most concepts from both worlds.
This scheme is used if no scheme parameter is given to the transform.xml.json command, or if it is set to null. For example:
pipeline:
  - transform.xml.json:
In the table below you can see what is supported in the default conversion scheme:
Feature | XML to JSON | JSON to XML |
---|---|---|
XML elements | yes | yes |
XML attributes | yes | yes |
XML mixed content | yes | yes |
XML processing instructions | yes | yes |
XML namespaces in elements | yes | yes |
XML namespaces in attributes | yes | yes |
XML CDATA | no | no |
XML comments | no | no |
JSON objects | n/a | yes |
JSON array of objects | n/a | yes |
JSON array of primitives | n/a | yes |
JSON data types | n/a | no |
The table shows, for each feature, whether its information is preserved when transforming between the two formats with the PIPEFORCE default scheme.
XML elements
An XML element like this:
<root/>
will by default be converted to a JSON structure like this:
{ "root": { "attributes":[], "children":[] } }
Even if there are no attributes and no children, those entries must exist in the JSON document as empty arrays. The null value is not allowed here.
This makes later processing easier and allows automatic detection of whether a document is in the default PIPEFORCE format.
Furthermore, only one JSON object is allowed at the first level, similar to XML.
If there are nested XML elements, they are placed inside the children section. For example, this:
<person> <firstName/> </person>
will be converted to this JSON with nested elements:
{ "person": { "attributes":[], "children":[ { "firstName": { "attributes":[], "children":[] } } ] } }
XML attributes
An XML element with an attribute like this:
<person age="23"/>
will be converted to a JSON structure like this:
{ "person": { "attributes":[ {"age": "23"} ], "children":[] } }
Text content
An XML element with text content like this:
<person> <firstName>Max</firstName> </person>
will be converted to this JSON structure:
{ "person": { "attributes":[], "children":[ { "firstName": { "attributes":[], "children":["Max"] } } ] } }
As you can see, the children array can contain both text content and elements.
Mixed content
In XML it is possible to mix XML elements with text content which could look like this:
<text>This is a <b>bold</b> formatted word.</text>
This will be converted to JSON like this:
{ "text": { "attributes": {}, "children": [ "This is a ", { "b": { "attributes": {}, "children": ["bold"] } }, " formatted word." ] } }
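The element, attribute, text and mixed-content rules described so far can be approximated in a few lines of Python. This is only a sketch of the documented structure, not the PIPEFORCE implementation. The function names are made up, whitespace-only text between elements is assumed to be dropped, and attributes are emitted as a list of single-key objects as in the XML attributes section (some examples on this page show an object instead, so treat that choice as an assumption).

```python
import xml.etree.ElementTree as ET


def element_to_default(elem):
    """Map one element to {tag: {"attributes": [...], "children": [...]}}."""
    children = []
    # Text before the first child element (ignored if it is only whitespace).
    if elem.text and elem.text.strip():
        children.append(elem.text)
    for child in elem:
        children.append(element_to_default(child))
        # Text that follows a child element belongs to the parent's children.
        if child.tail and child.tail.strip():
            children.append(child.tail)
    # Assumption: one single-key object per attribute.
    attributes = [{name: value} for name, value in elem.attrib.items()]
    return {elem.tag: {"attributes": attributes, "children": children}}


def xml_to_default(xml_text):
    """Parse an XML string and convert its root element."""
    return element_to_default(ET.fromstring(xml_text))


if __name__ == "__main__":
    import json
    print(json.dumps(xml_to_default("<text>This is a <b>bold</b> formatted word.</text>")))
```

Both keys are always present, even when empty, which matches the rule that `attributes` and `children` must never be missing or null.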
Namespaces
XML has the concept of namespaces, which allow XML structures to be extended with other, custom structures.
An XML document with a custom namespace could look like this:
<foo:person xmlns:foo="http://some.ns"> <foo:firstName foo:age="23"/> </foo:person>
As you can see, all elements and attributes are bound to the namespace here using the prefix foo.
If you convert this with the default XML to JSON transformation rules, you will get a JSON like this:
{ "foo:person": { "attributes": { "xmlns:foo": "http://some.ns" }, "children": [ { "foo:firstName": { "attributes": { "foo:age": "23" }, "children": [] } } ] } }
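To reproduce this shape outside PIPEFORCE, the parser has to keep prefixed names and the `xmlns:` declarations as they were written. The sketch below uses Python's `xml.dom.minidom`, which does so. It is an illustration only: the function name is hypothetical, text nodes are ignored for brevity, and attributes are collected into an object as in the example above.

```python
from xml.dom import minidom


def element_to_json(node):
    """Convert a DOM element, keeping prefixed names such as foo:person."""
    # minidom exposes xmlns:* declarations as ordinary attributes.
    attributes = dict(node.attributes.items())
    children = [
        element_to_json(child)
        for child in node.childNodes
        if child.nodeType == child.ELEMENT_NODE  # text nodes are skipped here
    ]
    return {node.tagName: {"attributes": attributes, "children": children}}


xml_text = (
    '<foo:person xmlns:foo="http://some.ns">'
    ' <foo:firstName foo:age="23"/> '
    "</foo:person>"
)
result = element_to_json(minidom.parseString(xml_text).documentElement)
print(result)
```

Because prefixes and declarations are kept verbatim, the namespace binding can be restored when converting back to XML.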
Processing instructions
XML documents can contain processing instructions in the prologue like this:
<?someInstruction someParams?> <root/>
This will be converted to a JSON like this:
{ "processing-instructions": { "someInstruction": "someParams" }, "root": { "attributes": {}, "children": [] } }
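For comparison, the processing instructions in a document prologue can be collected with Python's `xml.dom.minidom` as sketched below. The function name is made up for this example, and the result mirrors only the `processing-instructions` entry shown above, not the whole converted document.

```python
from xml.dom import minidom


def processing_instructions(xml_text):
    """Return the prologue processing instructions as {target: data}."""
    document = minidom.parseString(xml_text)
    return {
        node.target: node.data
        for node in document.childNodes
        if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
    }


print(processing_instructions("<?someInstruction someParams?> <root/>"))
```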
Jackson Conversion Scheme
SINCE VERSION 9.0
As an alternative to the default conversion scheme, you can use the conversion scheme type jackson:
pipeline:
  - transform.xml.json:
      scheme: jackson
This uses the Jackson XML mapping scheme to convert a given XML document to JSON. It creates a JSON structure that is much easier to use, but it doesn’t support all the XML features that the default conversion scheme does. Converting back from JSON to XML is also not straightforward. For one-way conversion XML → JSON with simple structures, however, it is a good choice.
The jackson conversion scheme is described in this table:
XML | JSON | Access with PEL (Examples) |
---|---|---|
Single root element: <root/> | Invalid (single root element is not allowed by default by Jackson) | |
Root element with text: <root>Some text</root> | Invalid (single root element with text is not allowed by default by Jackson) | |
Nested element: <root> <orders/> </root> | { "orders": "" } | orders -> "" |
Nested element with text: <root> <orders>Some text</orders> </root> | { "orders": "Some text" } | orders -> "Some text" |
Deeper nested element with text: <root> <orders> <item>My Item</item> </orders> </root> | { "orders": { "item": "My Item" } } | orders.item -> "My Item" |
Multiple deep elements of same name: <root> <orders> <item>My Item1</item> <item>My Item2</item> </orders> </root> | { "orders": { "item": [ "My Item1", "My Item2" ] } } | orders.item[0] -> "My Item1" orders['item'][1] -> "My Item2" orders.item -> [ "My Item1", "My Item2" ] |
Multiple elements of different name: <root> <orders> <itemA>Item1</itemA> <itemB>Item2</itemB> </orders> </root> | { "orders": { "itemA": "Item1", "itemB": "Item2" } } | orders.itemA -> "Item1" orders['itemB'] -> "Item2" orders -> { "itemA": "Item1", "itemB": "Item2" } |
XML attribute and text mixed: <root> <orders> <item id="0">Text</item> </orders> </root> | { "orders": { "item": { "id": "0", "": "Text" } } } | orders.item.id -> "0" orders.item[''] -> "Text" |
Mixed content (text + element): <root> text <a>Foo</a> </root> | { "": "\n text\n ", "a": "Foo" } | [''] -> "\n text\n " a -> "Foo" |
Text with line breaks and CDATA: <root> <someElem> <![CDATA[Some <b>text</b>]]> </someElem> </root> | { "someElem": "\n Some <b>text</b>\n " } | someElem -> "\n Some <b>text</b>\n " |
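The rows of this table follow a small set of rules: the root element is dropped, a leaf element becomes a string, repeated names become an array, and attributes share an object with the text under the empty key. The Python sketch below imitates those rules for simple input. It is not Jackson and does not call it; the function names are hypothetical, and mixed content or whitespace handling may differ from what Jackson actually produces.

```python
import xml.etree.ElementTree as ET


def jackson_like(elem):
    """Approximate the JSON value shown in the table for one element."""
    if len(elem) == 0 and not elem.attrib:
        return elem.text or ""  # leaf element: plain string
    result = dict(elem.attrib)  # attributes become ordinary keys
    if elem.text and elem.text.strip():
        result[""] = elem.text  # text next to attributes or elements
    for child in elem:
        value = jackson_like(child)
        if child.tag not in result:
            result[child.tag] = value
        elif isinstance(result[child.tag], list):
            result[child.tag].append(value)
        else:
            # Second occurrence of the same name: switch to an array.
            result[child.tag] = [result[child.tag], value]
    return result


def xml_to_jackson_like(xml_text):
    """Convert a document; the root element itself is not part of the result."""
    result = jackson_like(ET.fromstring(xml_text))
    if not isinstance(result, dict):
        raise ValueError("a root element without child elements is not supported")
    return result


print(xml_to_jackson_like("<root><orders><item>A</item><item>B</item></orders></root>"))
```

Note that a single `<item>` yields a string while two yield an array, so consumers of this scheme need to handle both shapes.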
Transform JSON → XML
With Command transform.json.xml
If the JSON document is in the default XML-JSON transformation format (contains attributes and children elements), then it will be transformed to XML using the default transformation rules explained above.
These rules are implemented by the transform.json.xml command.
Otherwise, if the starting point is a custom JSON document that doesn't comply with the default XML-JSON transformation rules, the conversion works differently: the rules explained below are applied automatically for the conversion JSON → XML.
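The distinction between the two cases can be sketched in Python as follows. How PIPEFORCE actually detects the default format is not documented here, so the check below (exactly one top-level element entry that has both `attributes` and `children`) is an assumption, as are all function names. Processing instructions are ignored when writing the XML.

```python
import xml.etree.ElementTree as ET

PI_KEY = "processing-instructions"


def is_default_format(doc):
    """Assumed check: one top-level element with attributes and children."""
    if not isinstance(doc, dict):
        return False
    elements = [value for key, value in doc.items() if key != PI_KEY]
    return (
        len(elements) == 1
        and isinstance(elements[0], dict)
        and "attributes" in elements[0]
        and "children" in elements[0]
    )


def default_to_element(node):
    """Rebuild one XML element from its default-format JSON object."""
    tag, body = next(iter(node.items()))
    elem = ET.Element(tag)
    attrs = body["attributes"]
    # Accept both an object and a list of single-key objects.
    pairs = attrs.items() if isinstance(attrs, dict) else [
        pair for entry in attrs for pair in entry.items()
    ]
    for name, value in pairs:
        elem.set(name, str(value))
    last = None
    for child in body["children"]:
        if isinstance(child, str):
            if last is None:
                elem.text = (elem.text or "") + child  # text before any element
            else:
                last.tail = (last.tail or "") + child  # text after an element
        else:
            last = default_to_element(child)
            elem.append(last)
    return elem


def default_to_xml(doc):
    if not is_default_format(doc):
        raise ValueError("document is not in the default format")
    element_doc = {key: value for key, value in doc.items() if key != PI_KEY}
    return ET.tostring(default_to_element(element_doc), encoding="unicode")
```

A document that fails the check would then go through the generic JSON → XML rules instead.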
A JSON document like this:
{ "person": { "firstName": "Max" } }
will be converted to this XML:
<root> <person> <firstName>Max</firstName> </person> </root>
A JSON document with an array in it could look like this:
{ "person": { "firstName": "Max", "hobbies": ["Reading", "Biking", "Swimming"] } }
This will be converted to an XML structure like this:
<root> <person> <firstName>Max</firstName> <hobbies>Reading</hobbies> <hobbies>Biking</hobbies> <hobbies>Swimming</hobbies> </person> </root>
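The behaviour shown in the two examples above (a wrapping root element, and one repeated element per array entry) can be imitated with the following Python sketch. It is not the code behind transform.json.xml. The function names are made up, and writing primitives with `str()` is an assumption, since this page does not say how numbers, booleans or null are rendered.

```python
import json
import xml.etree.ElementTree as ET


def append_value(parent, key, value):
    """Append a JSON value to parent as one or more <key> elements."""
    if isinstance(value, list):
        # Arrays have no XML counterpart: repeat the element per entry.
        for item in value:
            append_value(parent, key, item)
    elif isinstance(value, dict):
        child = ET.SubElement(parent, key)
        for child_key, child_value in value.items():
            append_value(child, child_key, child_value)
    else:
        # Assumption: primitives are written as plain text.
        ET.SubElement(parent, key).text = str(value)


def json_to_xml(json_text):
    """Wrap a JSON object in <root> and return the XML as a string."""
    root = ET.Element("root")
    for key, value in json.loads(json_text).items():
        append_value(root, key, value)
    return ET.tostring(root, encoding="unicode")


print(json_to_xml('{"person": {"firstName": "Max", "hobbies": ["Reading", "Hiking"]}}'))
```

Converting the result back to JSON would not necessarily restore the array, because a single repeated element is indistinguishable from a one-entry array in XML.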
With FTL template
Another approach to transforming JSON to XML is the transform.ftl command, which converts any JSON structure to any XML structure using the FreeMarker template language.
This approach is very flexible since any input JSON can be converted to any concrete output XML structure.
For more information about this approach, see Template Transformation.
Here is an example:
# The source JSON document
body: {
  "rows": [
    {
      "firstName": "Max",
      "lastName": "Smith",
      "age": "38"
    },
    {
      "firstName": "Susann",
      "lastName": "Mayr Wan",
      "age": "44"
    }
  ]
}

pipeline:
  - transform.ftl:
      # These are the conversion rules from JSON -> XML
      template: |
        <root>
          <#list body.rows as person>
            <person>
              <firstName>${person.firstName}</firstName>
              <lastName>${person.lastName}</lastName>
              <age>${person.age}</age>
            </person>
          </#list>
        </root>
This will result in an XML output like this:
<root>
  <person>
    <firstName>Max</firstName>
    <lastName>Smith</lastName>
    <age>38</age>
  </person>
  <person>
    <firstName>Susann</firstName>
    <lastName>Mayr Wan</lastName>
    <age>44</age>
  </person>
</root>
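To check the expected output of such a template outside PIPEFORCE, the same loop can be written in plain Python. This sketch does not use FreeMarker or the transform.ftl command; the function name is hypothetical. Values are XML-escaped here as a precaution, since this page does not state whether the template engine escapes them.

```python
from xml.sax.saxutils import escape


def rows_to_xml(body):
    """Render body["rows"] with the same element layout as the template."""
    parts = ["<root>"]
    for person in body["rows"]:
        parts.append("  <person>")
        for field in ("firstName", "lastName", "age"):
            # escape() protects against &, < and > in the data.
            parts.append(f"    <{field}>{escape(str(person[field]))}</{field}>")
        parts.append("  </person>")
    parts.append("</root>")
    return "\n".join(parts)


body = {
    "rows": [
        {"firstName": "Max", "lastName": "Smith", "age": "38"},
        {"firstName": "Susann", "lastName": "Mayr Wan", "age": "44"},
    ]
}
print(rows_to_xml(body))
```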