
This section introduces the data mapping and transformation tools you can use in PIPEFORCE.

Data Size Classification

Before you select a data mapping and transformation tool, always consider the expected input data first. Depending on its size, some tools are better suited than others. The following classification by data size is commonly used:

| Class | Size | Description |
|---|---|---|
| Small | < 10 MB | Can be handled easily in memory. |
| Medium | < 100 MB | Can be handled on a single server node, but needs persistence in most cases because it is too big to be processed in memory. |
| Large | <= Gigabytes | Requires special data management techniques and systems. Must be distributed across systems. |
| Very Large (Big Data) | >= Terabytes | Also known as "Big Data", these datasets encompass volumes of data so large that they require special processing techniques on multiple highly scalable nodes. They usually range from terabytes to petabytes or more. |

Note that the boundaries between these classes are sometimes fuzzy, and it is not always obvious which class applies.
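
To make the classification concrete, here is a minimal Python sketch that maps a payload size in bytes to one of the classes above. It is not part of PIPEFORCE: the function name `classify_data_size` is hypothetical, the decimal units (1 MB = 10^6 bytes) are an assumption, and the 1 TB cutoff between "Large" and "Very Large" is an assumption as well, since the classification gives no exact boundary there.

```python
# Illustrative helper only, not a PIPEFORCE API.
# Assumption: decimal units (1 MB = 10**6 bytes, 1 TB = 10**12 bytes).
MB = 10**6
TB = 10**12


def classify_data_size(size_bytes: int) -> str:
    """Return the data size class for a payload of the given size in bytes."""
    if size_bytes < 0:
        raise ValueError("size_bytes must not be negative")
    if size_bytes < 10 * MB:
        return "Small"       # fits easily in memory
    if size_bytes < 100 * MB:
        return "Medium"      # single node, usually needs persistence
    if size_bytes < TB:
        return "Large"       # assumed cutoff: anything below 1 TB
    return "Very Large"      # terabytes and above ("Big Data")


if __name__ == "__main__":
    for size in (5 * MB, 50 * MB, 20 * 10**9, 3 * TB):
        print(size, classify_data_size(size))
```

Because the boundaries are fuzzy, a result near a threshold is only a first hint for tool selection, not a hard rule.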

Transformer Commands

A transformer command in PIPEFORCE is a command that converts data from one structure into another. For example:

...