The transformation builder can handle arrays in a number of ways. You can copy an array, or explode the contents, choosing to include or exclude null values.
When you add an array to a schema, choose from the following array handling options.
Copy
The Copy option generates a single row in the output for each row in the input.
Explode
The "Explode" option results in multiple rows in the output data set for each row in the input. Each non-null value in an array generates a separate row in the output. For example, using the default behavior on the following input generates two unique output rows with the same name values and different phone details.
{
"givenName": {
"first": "Michael",
"last": "Smith"
},
"preferredName": {
"first": "Mike",
"last": "Smith"
},
"phone": [
{
"type": "home",
"number": "212 555-1234"
},
{ "type": "work",
"number": "646 555-4567"
}
}
}
Explode Outer
By default, the built schema uses the "Explode Outer" option for the Output Transform Action on any arrays in your JSON object. Explode Outer behaves like Explode, but it also generates rows in the even of Null values. Using the above JSON, for example, if the value of the work phone number was Null - that is, if an entry existed for "type": "work"
, but the entry for number contained no value or was missing - an output row would still be created, containing a Null value.
Pos Explode
Pos Explode is simlar to Explode. Additionally, you can specify a related field. The related field must have the same Source Data Type and Output Transform Action. The data is considered to be related by its position in each array. For example, you could start with the following simple JSON:
{"field": ["first","second"], "relatedField": ["one","two"]
If you handle each array by using Explode or Explode Outer, the generated output contains four rows, one for each combination of the two values in each array.
field | relatedField |
---|---|
first | one |
first | two |
second | two |
second | one |
If you select the Pos Explode option for each array, and specify "relatedField" as the Related Field of "field", the output contains two rows:
field | relatedField |
---|---|
first | one |
second | two |
Pos Explode Outer
Pos Explode Outer behaves the same as Pos Explode, but also includes Null values in its output.