Data Converter (Deprecated) - Data360_Analyze - Latest

Data360 Analyze Server Help

Product type
Software
Portfolio
Verify
Product family
Data360
Product
Data360 Analyze
Version
Latest
Language
English
Product name
Data360 Analyze
Title
Data360 Analyze Server Help
Copyright
2024
First publish date
2016
Last updated
2024-11-28
Published on
2024-11-28T15:26:57.181000

This deprecated node discovers the best data type for each field in the input data set and outputs the data on the convertedData output pin. Converts the data type of each field based on the discovered data type.

CAUTION:
This node has been deprecated and will not be supported in a future release. As an alternative, the Modify Fields node can be used to provide similar functionality.

The node treats empty string values as NULLs by default. An empty string is a string or Unicode value which is either of zero length or all whitespace. If the EmptyStringIsNull property is set to False, empty strings are treated as distinct string values and remain unchanged on the convertedData output pin.

In addition to this, if a string value in the data set contains leading zeros e.g. ZIP Code or an ID, the ConvertLeadingZeros property determines if this should be converted to a numeric type (integer, long or double) for analysis. Generally, leading zeros are an indicator that the data is not a number. By default, data type conversion does not apply for string values containing leading zeros.

Automatic sampling can be applied by the node when the UseAutoSampling property is set to True. If set to True, the node selects and analyzes a subset of data to identify patterns and then applies the discovered data types to the entire data set. The performance of the node may also be increased when automatic sampling is applied, which can be useful for large data sets. By default, automatic sampling is not applied.

The node provides two output pins: convertedData and analysis.

The analysis output pin contains three fields:

1. Field Name displays the name of each field from the input data set.

2. Current Type displays the original data type of each field.

3. Discovered Type displays the discovered data type for each field if the input field is of type string or Unicode.

The convertedData output pin contains the input data converted to the best type detected in the data. This conversion is only performed on fields of type string or Unicode.

To calculate the converted data, the coercions are attempted in the following order: Integer, Long, Double, Boolean, Date, Time.

When the node arrives at the Date type, the coercions are attempted in the following order: CCYY/M/D, CCYY/D/M, M/D/CCYY, D/M/CCYY, Y/M/D, Y/D/M, M/D/Y, D/M/Y, CCYY-M-D, CCYY-D-M, M-D-CCYY, D-M-CCYY, Y-M-D, Y-D-M, M-D-Y, D-M-Y. For more information, see date.

When the node arrives at the Time type, the coercions are attempted in the following order: H:M:SPM, H:M:S PM, H:M:S, H:MPM, H:M PM, H:M. For more information, see time.

The node supports the following datetime formats:

Format Example
yyyy-MM-dd HH:mm:ss 2015-01-02 00:00:01
yyyy-MM-dd'T'HH:mm:ss 2015-01-02T00:00:01
yyyy-MM-dd'T'HH:mm:ss.SSS 2015-01-02T00:00:01.001
yyyyMMdd'T'HHmmss 20150102T000001
yyyyMMdd'T'HHmmss.SSS 20150102T000001.001
yyyy-MM-dd'T'HH:mm:ssZ 2015-01-02T00:00:01+0800
yyyy-MM-dd'T'HH:mm:ss.SSSZ 2015-01-02T00:00:01.001-08:00
yyyy-MM-dd'T'HH:mm:ss'Z' 2015-01-02T00:00:01Z
yyyy-MM-dd'T'HH:mm:ss.SSS'Z' 2015-01-02T00:00:01.001Z
yyyy-MM-dd'T'HH:mm:ss.SSS'Z' 2015-01-02T00:00:01.001Z
MM/dd/yyyy HH:mm:ss 01/02/2015 00:00:01
dd-MM-yyyy HH:mm:ss 01-01-2015 00:00:01
dd MMM yyyy HH:mm:ss 01 JAN 2015 00:00:01
dd MMMM yyyy HH:mm:ss 01 JANUARY 2015 00:00:01

All datetimes are displayed in local time. If the datetime does not contain timezone offset information, the timezone is assumed to be local. If the datetime does contain timezone offset information, the Data Converter node will take this into account when processing and adjust the value to local time, accounting for daylight savings if required.

A Create Data node contains the following string field in ISO8601 format, including timezone offset information from UTC:

DateTime:

string

2017-01-01T13:00:01.000+01:00
2017-02-02T14:05:00.000+00:00
2017-06-06T13:00:00.000-02:00

Running the Data Converter node adjusts the records to local datetime values:

Record ID

DateTime:

datetime

1 2017-01-01T12:00:01+0000
2 2017-02-02T14:05:00+0000
3 2017-06-06T16:00:00+0100

1) Timezone offset +1 hour - time adjusted back by 1 hour to change to local UKtime (GMT).

2) No timezone offset - time assumed to be local, therefore no change to value.

3) Timezone offset -2 hours - time adjusted forward by 3 hours to change to local UK time (BST), with an additional hour to account for daylight savings.

Properties

EmptyStringIsNull

Optionally specify how to interpret strings that are either zero length or all whitespace.

If set to True, strings that are either zero length or all whitespace are interpreted as NULL. If False, they are treated as distinct string values.

The default value is True.

ConvertLeadingZeros

Optionally specify if a string that contains leading zeros should be converted to a numeric type (int, long, or double) for analysis. Generally, leading zeros are an indicator that the data is not a number. For example, it could be a ZIP Code or an ID.

The default value is False.

UseAutoSampling

Optionally specify whether automatic sampling is to apply.

If set to True the node samples the input data set, discovers the best data type and applies this data type to the entire input data set.

The default value is False.

Inputs and outputs

Inputs: data.

Outputs: convertedData, analysis.