Change Metadata - Data360_Analyze - Latest

Data360 Analyze Server Help

Product type
Software
Portfolio
Verify
Product family
Data360
Product
Data360 Analyze
Version
Latest
Language
English
Product name
Data360 Analyze
Title
Data360 Analyze Server Help
Copyright
2024
First publish date
2016
Last updated
2024-11-28
Published on
2024-11-28T15:26:57.181000

Change a field's name and data type within the source data. Allows the fields to be reordered, and fields to be duplicated.

Changed fields will be output in the order in which they appear in the configuration on the second input pin. Fields in the original input data that are not changed can optionally be output following the changed fields, using the PassThroughUnchanged property. If either the new name or the new type is Null, that field will not have its respective name or type changed.

The first input pin (the input data pin) contains data to be changed. The second input pin (the metadata changes pin) should contain at least 3 fields:

  • A field containing the original field names of the fields whose metadata is to be changed. Specify the name of the field that contains the original field names in the OldNameField property.
  • A field containing the new names for the original fields. Specify the name of the field that contains the new field names in the NewNameField property.
    Note: The field value must be of data type string. Unicode field types are not permitted and it is not recommended to use numeric fields, or string fields that contain only numeric characters.
  • A field containing the new data types of the fields. Specify the name of the field that contains the data types in the NewTypeField property.
    Note: The data type values should be Data360 Analyze data types (as specified in the header row of the data viewer). Note that these data types do not align with the data types used in Python-based nodes. The valid Data360 Analyze data types are: int, long, double, unicode, string, date, time, datetime and boolean (case sensitive).

The properties DateTimeFormatString, DateFormatString, and TimeFormatString can be used to set the string format that dates are parsed from. When setting a format, the following pattern letters can be used (all other characters from "A" to "Z" and from "a" to "z" are reserved):

G Era Designator (For example, "AD")

y Year (For example, 1996;96)

M Month in year (e.g. "July";"Jul";07)

w Week in year (e.g. 27)

W Week in month (e.g. 2)

D Day in year (e.g. 189)

d Day in month (e.g. 10)

F Day of week in month (e.g. 2)

E Day in week (e.g. "Tuesday";"Tue")

a AM/PM marker (e.g. "AM")

H Hour in day (0-23 e.g. 0)

k Hour in day (1-24 e.g. 24)

K Hour in AM/PM (0-11 e.g. 0)

h Hour in AM/PM (1.12 e.g. 12)

m Minute in hour (e.g. 30)

s Second in minute (e.g. 55)

S Millisecond (e.g. 978)

z General Time Zone (e.g. "Pacific Standard Time"; "PST"; "GMT-08:00")

Z RFC 822 Time Zone (e.g. "-0800")

Pattern letters are usually repeated, as their number determines the exact presentation. For example, the format string "yyyy" will parse a string such as "2012" as the year 2012, whereas the format "yy" will parse a string "12" as the year 2012.

Other examples of date formats:

"yyyy.MM.dd G 'at' HH:mm:ss z" 2001.07.04 AD at 12:08:56 PDT
"EEE, MMM d, ''yy" Wed, Jul 4, '01
"h:mm a" 12:08 PM
"hh 'o''clock' a, zzzz" 12 o'clock PM, Pacific Daylight Time
"K:mm a, z" 0:08 PM, PDT
"yyyyy.MMMMM.dd GGG hh:mm aaa" 02001.July.04 AD 12:08 PM
"EEE, d MMM yyyy HH:mm:ss Z" Wed, 4 Jul 2001 12:08:56 -0700
"yyMMddHHmmssZ" 010704120856-0700
"yyyy-MM-dd'T'HH:mm:ss.SSSZ" 2001-07-04T12:08:56.235-0700

Example - Prefixing column position number to field names

  1. Drag a Create Data node onto the canvas and connect it to a Get Metadata node.
  2. Connect the Get Metadata node to a Transform node.

    In the ConfigureFields property, enter:

    #map all of the fields over

    out1 += in1

    #output the FieldName field as NewFieldName

    out1.NewFieldName = in1.FieldName

    #rename the FieldType field to NewFieldType

    # and exclude the FieldType field from the output

    out1 -= in1.FieldType

    out1.NewFieldType = in1.FieldType

    In the ProcessRecords property, enter:

    #Copy all fields from input 'in1' to the corresponding output fields

    #in output 'out1'. Copies the fields which have been setup

    #in the mapping defined in the ConfigureFields property

    out1 += in1

    #add suffix of column number to all records in NewFieldName field

    out1.NewFieldName = in1.FieldName + "_" + str(int(node.execCount))

    The Transform node changes the field names by suffixing the column number after the name, e.g. "color" becomes "color_1".

  3. Drag a Change Metadata node onto the canvas, then connect the Create Data node to the first input pin and connect the Transform node to the second input pin.
  4. Run the data flow.

    The Change Metadata node outputs the data from the Create Data node, with the new column names containing the suffixed column numbers.

    Tip: Note that in this example, there is no need to configure the properties in the Change Metadata node because the default values are used as the input field names. If the input field names were different, you would need to enter the different values in the corresponding Change Metadata properties.

Properties

OldNameField

Specify the name of field to use when reading in old column names from the second input pin. The default value is FieldName.

NewNameField

Specify the name of field to use when reading in new column names from the second input pin. The default value is NewFieldName.

NewTypeField

Specify the name of field to use when reading in new column data types from the second input pin. The default value is NewFieldType.

ErrorThreshold

Optionally specify the number of transfer errors that will cause the node to give up and fail.

Each record on the input pin is a "request". A transfer error is any error that causes a request to fail (e.g. a requested file does not exist). Setting this property instructs the node to continue processing requests as long as the number of errors remains below the given threshold.

An ErrorThreshold of 0 means never fail on a transfer error (the node will still fail on more serious errors). The default value is 1 i.e. the node fails on the first error encountered.

PassThroughUnchanged

Optionally specify whether or not fields in the input data which have not had any metadata changes applied should appear in the output data. If this is set to true, such fields will appear after those that have had their metadata changed. If false, these fields are excluded. The default value is True.

EmptyStringAsNull

Optionally specify whether empty string values in the input data will be treated as NULL for the purposes of changing the metadata of the field. The default value is False.

DateTimeFormatString

Optionally specify the date format string to be used for changing data to datetime format. The characters to be used in writing the format string are described in the node help. The default value is "yyyy-MM-dd HH:mm:ss".

DateFormatString

Optionally specify the date format string to be used for changing data to date format. The characters to be used in writing the format string are described in the node help. The default value is "yyyy-MM-dd"

TimeFormatString

Optionally specify the date format string to be used for changing data to time format. The characters to be used in writing the format string are described in the node help. The default value is "HH:mm:ss"

Inputs and outputs

Inputs: input data, metadata changes.

Outputs: changed data, errors.