Pivot - Data To Names - Data360_Analyze - Latest

Data360 Analyze Server Help

Product type
Software
Portfolio
Verify
Product family
Data360
Product
Data360 Analyze
Version
Latest
Language
English
Product name
Data360 Analyze
Title
Data360 Analyze Server Help
Copyright
2024
First publish date
2016
Last updated
2024-11-28
Published on
2024-11-28T15:26:57.181000

Converts input data (record data rather than the column headers) into field names in the output.

For each row in the input data, the contents of the field specified in the NamesField property and the DataField property will be used together to produce a row in the output data, whereby the contents of the field specified in the NamesField property identify the output field name under which the contents of the field specified in the DataField property are placed.

  1. In the NamesField property, specify the input field that contains the names that you want to use to construct additional columns in the output.

    By default, an error will occur if the transformed records have duplicate or missing attribute names, missing or additional attributes, or invalid characters in the transposed attribute names. You can modify this behavior by setting the OnDuplicate, NullValueBehavior and SubstituteInvalidChars properties, as required.

  2. In the DataField property, specify the input field that contains the data that you want to place under the output columns identified by the NamesField property.
  3. In the GroupBy property specify the input fields on which to aggregate the input records. This means that, within the specified field, any rows that have the same value are collated into one row in the output.
    Note: If you do not specify any value in the GroupBy property, the input data is considered to be a single group which can significantly impact the performance of the node. It is therefore recommended that you specify a value in this property when dealing with medium or large data sets. Also note that the specified input field must not contain floating point data, and fields with a data type of "double" should not be used to group data.

Example

You have the following input data set that you want to transpose so that the data in the "Field" column becomes the field names in the output, under which the corresponding data in the 'Value' column will be listed. In the input data set, there are two rows per record, indicated by the 'ID' column:

IDlong Fieldstring Valueunicode
1 CustomerName Martine Rance
1 TotalProductsPurchased 50
2 CustomerName Miguel Angel
2 TotalProductsPurchased 14
3 CustomerName Art Braunschweiger
3 TotalProductsPurchased 16

In the NamesField property, identify the input field that contains the data that is to become the new column names in the output by entering the following:

Field

In the DataField property, identify the input field that contains the attribute values that is to be listed under the new column names by entering the following:

Value

In the GroupBy property, as there are two rows per "ID", indicate that you want to collate each set of two rows into a single row in the output by entering the following:

ID

The output is as follows:

IDlong CustomerNameunicode TotalProductsPurchasedunicode
1 Martine Rance 50
2 Miguel Angel 14
3 Art Braunschweiger 16

If you want to remove the "ID" column from the output, you can do this by using a Transform node.

Properties

NamesField

Specify the input field that contains the names that you want to use to construct additional columns in the output.

For example, suppose the input data contains three fields: color, fldName, fldData, and has three rows of data. Row one contains "red, name1, data1", row two contains "red, name2, data2" and row three contains "red, name3, data3". If this property specifies fldName, then the output data will contain the following fields: color, name1, name2 and name3.

DataField

Specify the input field that contains the data that you want to place under the output columns identified by the NamesField property.

For each row in the input data, the contents of the field specified in the NamesField property and the DataField property will be used together to produce a row in the output data, whereby the contents of the field specified in the NamesField property identify the output field name under which the contents of the field specified in the DataField property are placed.

GroupBy

Optionally specify the input fields on which to aggregate the input records; that is, within the specified field, any rows that have the same value are collated into one row in the output.

Note: If you do not specify any value in the GroupBy property, the input data is considered to be a single group which can significantly impact the performance of the node; it is therefore recommended that you specify a value in this property when dealing with medium or large data sets. Also note that the specified input field must not contain floating point data.

SortInput

Optionally specify whether the input will be sorted based on the value specified in the GroupBy property. If no value is specified in the GroupBy property, the SortInput property has no effect.

The default value is True.

NameSet

Optionally specify what to do with pivot groups that either have extra names or missing names. Choose from:

  • Exact - An error occurs if an expected name is missing or if there is an extra name.
  • Union - The cells of the missing names are filled with NULL.
  • Intersection - Only the fields that all groups share are included in the output.

The default value is Exact.

FieldDateFormat

If the NamesField is of Date type, then optionally specify what format the field name will be written as. The default is yyyy-MM-dd.

FieldTimeFormat

If the NamesField is of Time type, then optionally specify what format the field name will be written as. The default is HH_mm_ss.

FieldDateTimeFormat

If the NamesField is of DateTime type, then optionally specify what format the field name will be written as. The default is yyyy-MM-dd HH_mm_ss.

FieldDoubleFormat

If the NamesField is of a Double type, then optionally specify what format the field name will be written as. The default is ###,###.00.

RoundingMode

Optionally specify the method used for rounding floating point numbers to integral (long/int) output fields. Choose from:

  • Ceiling - Rounds towards positive infinity.
  • Down - Rounds towards zero.
  • Floor - Rounds towards negative infinity.
  • Half Down - Rounds towards "nearest neighbor" unless both neighbors are equidistant, in which case, rounds down.
  • Half Even - Rounds towards "nearest neighbor" unless both neighbors are equidistant, in which case, rounds towards the even neighbor. Also known as "Banker's Rounding" and statistically minimizes cumulative error.
  • Half Up - Rounds towards "nearest neighbor" unless both neighbors are equidistant, in which case, rounds up.
  • Up - Rounds away from zero.

This value is also used if the type of an input field is string based, but the contents of the string field have a floating point representation for example when writing an input string field with a value of "1.45" to an int type, the RoundingMode property will be used.

See the Oracle documentation, for example https://docs.oracle.com/javase/8/docs/api/java/math/RoundingMode.html for more information on the different RoundingMode options.

The default value is Half Even.

Locale

Optionally specify the locale to be used for parsing string fields into other types.

The Locale must be specified in the format <language code>[_<country code>[_<variant code>]]

For example, to set the locale to English, the Locale property should be set to en.

For US English, the Locale property should be set to en_US.

For Austrian German, the Locale property should be set to de_AT.

This is especially important in the parsing of strings representing floating point numbers, however does not preclude parsing of floating point numbers in an international computing format.

For example, with the Locale property set to de_DE for German-speaking Germany, the following input data "1.234,56", "543.111,01" will be parsed as 1234.56 and 543111.01 respectively. However, if the input data provided is of the form: "1234.56", "543111.01" the node will determine that these should still be parsed as 1234.56 and 543111.01 respectively.

The default value is the default Locale under which the server is running.

DuplicateFieldNameBehavior

Optionally specify what to do if there are duplicate field names in the field name specified in the NamesField property for a given GroupBy group. Choose from:

  • Error - The node fails if there is a duplicate value in the field specified in the NamesField property in a single group.
  • Output new Record - All duplicate rows are included in the output.
  • Replace - Only one row of data is output.

The default value is Error.

If this property is set to Error, the node fails when it encounters a duplicate NamesField value in a single group. At each duplicate instance, the pivot fields will be cleared and updated with the duplicate value. If this property is set to Replace, the new value simply replaces the previous value, and only one row of data will be produced.

NullValueBehavior

Optionally specify what to do if the data within a field that is to be pivoted to a name is NULL. Choose from:

  • Ignore - The NULL data is not included in the output.
  • Error - The node fails to run.
  • Output as Null - A new field name called "NULL" is created.

The default value is Error.

SubstituteInvalidChars

Optionally specify what to do if a field which is to be pivoted contains invalid metadata characters. For example, for BRD metadata, there are restricted characters. If set to true, the invalid characters will be substituted for acceptable BRD metadata characters. If set to false, the node will error if there are any invalid characters in a 'names' field.

The default value is False.

MaxFieldCount

Optionally specify the maximum number of output fields allowed.

Incorrect configuration of the node (for example attempting to pivot on a unique field like an account balance) can lead to very large output metadata. To prevent inadvertent mistakes which would result in extremely large output records which could in turn cause the node to run out of memory, the default value of this property is 256.

Inputs and outputs

Inputs: PivotData, multiple optional.

Outputs: PivotedData, multiple optional.