Get MongoDB Data - Data360_Analyze - Latest

Data360 Analyze Server Help

Copyright: 2024
First publish date: 2016
Last updated: 2024-11-28
Published on: 2024-11-28T15:26:57.181000

Converts JSON data output from Mongo queries to Data360 Analyze native tabular format.

Properties

JsonData

Specify the source of JSON data that is to be read. A value is required for this property.

  • Choose the (from Filename) variant of this property to specify the name of file containing JSON data.
  • Choose the (from Filename Field) variant of this property to specify the name of an input field containing JSON file names.
  • Choose the (from Data Field) variant of this property to specify the name of the input field containing JSON data.

AllowBackslashEscapingAnyCharacter

Optionally specify whether or not the backslash ('\') character can be used to escape any other character. If set to true, an escape sequence that is not otherwise defined yields the escaped character itself. If set to false, only the following escapes can be used: \" \\ \/ \b \f \n \r \t \u.

The default value is True.

AllowComments

Optionally specify whether or not comments are allowed in the JSON data. Note that comments used to be supported in JSON but were removed in later updates to the JSON format. However, in order to support older JSON data files, this option is set to true by default. The comment styles allowed are the standard C/C++ styles ("//" and "/* */"), using the formats:

/* Multi-LineComment

*/

data : { //Comment to end of line }

The default value is True.
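
To illustrate what accepting these comment styles involves, here is a minimal Python sketch that strips "//" and "/* */" comments outside string literals before handing the text to a strict JSON parser. It is not the node's implementation; the function name strip_comments and the sample data are hypothetical.

```python
import json

def strip_comments(text: str) -> str:
    """Remove // and /* */ comments that appear outside string literals."""
    out = []
    i, n = 0, len(text)
    in_string = False
    while i < n:
        ch = text[i]
        if in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < n:
                # Copy the escaped character so an escaped quote does not end the string.
                out.append(text[i + 1])
                i += 2
                continue
            if ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            # Skip to the end of the line, keeping the newline itself.
            end = text.find("\n", i)
            i = n if end == -1 else end
        elif text.startswith("/*", i):
            # Skip to just past the closing marker (or to the end if it is missing).
            end = text.find("*/", i + 2)
            i = n if end == -1 else end + 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)

sample = '''/* Multi-line
comment */
{
  "url": "http://example.com", // comment to end of line
  "count": 3
}'''
parsed = json.loads(strip_comments(sample))
```

Tracking whether the scanner is inside a string keeps the "//" in the URL value from being treated as a comment.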

AllowNonNumericNumbers

Optionally specify whether "Not-a-Number" (NaN) tokens are recognized as legal floating-point number values (similar to how many other data formats and programming languages allow them).

The default value is True.
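
The same lenient/strict distinction exists in other JSON parsers. As an analogy only (this is Python's standard json module, not the node), the sketch below shows a parser that accepts NaN by default and can be made to reject it; reject_constant is a hypothetical helper name.

```python
import json
import math

def reject_constant(token: str):
    """Called by the parser for NaN, Infinity and -Infinity tokens."""
    raise ValueError(f"non-numeric number not allowed: {token}")

# Lenient behaviour: the NaN token becomes a floating-point NaN.
lenient = json.loads('{"ratio": NaN}')

# Strict behaviour: the same input is rejected.
try:
    json.loads('{"ratio": NaN}', parse_constant=reject_constant)
    strict_rejected = False
except ValueError:
    strict_rejected = True
```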

AllowNumericLeadingZeros

Optionally specify whether or not integral numbers are allowed to start with additional (ignorable) zeroes (for example, 000001).

The default value is True.

AllowSingleQuotes

Optionally specify whether or not single quote characters (apostrophe, "'") are allowed for quoting strings in the JSON format.

The default value is True.

AllowUnquotedControlChars

Optionally specify whether or not unquoted control characters (ASCII characters with value less than 32, including tab and line feed characters) are allowed in strings.

Note: If set to true, these control characters will be ignored.

The default value is True.

AllowUnquotedFieldNames

Optionally specify whether or not unquoted field names are allowed in the JSON data.

The default value is True.

Charset

Optionally specify the character set to be used when processing the structured data. Some common examples are ASCII, UTF-8, UTF-16, UTF-32.

The default value depends on the source of the data to parse:

  • If the data comes from an input field of type string, then the Data360 Analyze server character set is used as a default.
  • If the data comes from an input field of type unicode, then UTF-8 is used as a default.
  • If the data comes from a file, the default depends on the type of file being processed.

CoerceData

In some cases, the data being parsed is of a specific type (e.g. integer, boolean).

Optionally specify whether this data should be output in that type, or coerced to the CharacterDataOutputFieldType.

In certain cases, the same field may be present in the data with different types in different parts of the data. In these cases, there is no way to output the data in the type inferred from the input data without coercion. Choose from:

  • Never - The data will always be output in the inferred type from the parsed data. If there is a conflict - whereby the same field has different types in different parts of the parsed data, the node will error.
  • Always - All fields will be coerced and output as the CharacterDataOutputFieldType.
  • On Conflict - Wherever possible, the field will be output in the inferred type from the parsed data. For fields which are present in the parsed data with different types in different parts of the data, the field will be coerced to the CharacterDataOutputFieldType.

The default value is Never.
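
The three choices can be sketched in Python as follows. This is an illustration of the rules described above, not the node's code: the function name resolve_field_types is hypothetical, Python type names stand in for the node's inferred types, and the label "character" stands in for the CharacterDataOutputFieldType.

```python
from typing import Any

def resolve_field_types(records: list[dict[str, Any]], coerce: str = "Never") -> dict[str, str]:
    """Decide one output type per field according to the CoerceData choice."""
    # Collect every type observed for each field across all records.
    seen: dict[str, set[str]] = {}
    for record in records:
        for name, value in record.items():
            seen.setdefault(name, set()).add(type(value).__name__)

    resolved = {}
    for name, types in seen.items():
        if coerce == "Always":
            resolved[name] = "character"
        elif len(types) == 1:
            # No conflict: keep the inferred type.
            resolved[name] = next(iter(types))
        elif coerce == "On Conflict":
            resolved[name] = "character"
        else:
            # "Never" with conflicting types is an error.
            raise ValueError(f"conflicting types for field {name!r}: {sorted(types)}")
    return resolved

records = [{"id": 1, "flag": True}, {"id": "A-2", "flag": False}]
on_conflict = resolve_field_types(records, "On Conflict")
always = resolve_field_types(records, "Always")
```

Here the field id appears as both an integer and a string, so only it is coerced under On Conflict, while the same input raises an error under Never.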

OutputNestingCharacter

Optionally specify the character to use to identify hierarchical relationships in the output fields. This applies to both the field names in the output metadata and to the mapping of fields to output pins.

The default value is "."

Example: If the input data contained a field "Country" which had a subfield "Population", and the OutputNestingCharacter was set to ".", then the subfield "Population" would be output as "Country.Population". Similarly, in order to have this field mapped to a specific output, you can set the name of the output to "Country.Population".
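
The naming rule in this example can be sketched in a few lines of Python. The function flatten and the sample value are hypothetical and only illustrate how hierarchical names are joined with the nesting character.

```python
def flatten(data: dict, nesting_char: str = ".", prefix: str = "") -> dict:
    """Flatten nested dictionaries, joining the levels with nesting_char."""
    flat = {}
    for key, value in data.items():
        name = f"{prefix}{nesting_char}{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into subfields, carrying the hierarchical name along.
            flat.update(flatten(value, nesting_char, name))
        else:
            flat[name] = value
    return flat

document = {"Country": {"Population": 1000}}
dotted = flatten(document)
underscored = flatten(document, nesting_char="_")
```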

OutputReferenceIds

Optionally specify whether or not reference identifiers should be output. This property only has effect when more than one output is present and receiving data from the data source.

In such cases, where hierarchical data is being flattened to multiple tabular outputs, the reference identifiers can be used to identify how the data in the different outputs is related.

These identifiers can then be used in subsequent join nodes if required to reassemble the data or identify the relationships between the different outputs.

The default value is True.
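
As a rough illustration of why reference identifiers matter, the following Python sketch flattens hierarchical documents into a parent table and a child table and then rejoins them. The field names ref_id and parent_ref_id, the function name, and the sample data are all assumptions; the names the node actually generates may differ.

```python
from itertools import count

def split_with_reference_ids(documents):
    """Flatten documents into a parent table and a child table linked by ids."""
    next_id = count(1)
    parents, children = [], []
    for doc in documents:
        ref_id = next(next_id)
        parents.append({"ref_id": ref_id, "customer": doc["customer"]})
        for item in doc.get("items", []):
            # Each child row records which parent row it came from.
            children.append({"parent_ref_id": ref_id, "sku": item["sku"]})
    return parents, children

docs = [
    {"customer": "A", "items": [{"sku": "x1"}, {"sku": "x2"}]},
    {"customer": "B", "items": [{"sku": "y1"}]},
]
parents, children = split_with_reference_ids(docs)

# Rejoin the two tables on the reference id, as a downstream join node might.
customer_by_id = {row["ref_id"]: row["customer"] for row in parents}
rejoined = [(customer_by_id[row["parent_ref_id"]], row["sku"]) for row in children]
```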

PassThroughFields

Optionally specify which input fields will "pass through" the node unchanged from the input to the output, assuming that the input exists. The specified input fields appear on the output records that were produced from the corresponding input record. Choose from:

  • All - Passes through all the input data fields to the output.
  • None - Passes none of the input data fields to the output; as such, only the fields created by the node appear on the output.
  • Used - Passes through all the fields that the node used to create the output. Used fields include any input field referenced by a property, in this case, the "Filename Field" or "Data Field" if the data to parse is coming from such a field.
  • Unused - Passes through all the fields that the node did not use to create the output.

The default value is Unused.

Note: The pass through fields are only written to outputs which receive parsed data. These are not written to the Structure or Errors output pins.

In the case where for a given input record, the only data to be written to an output would be the pass through fields, whether or not these pass through fields are written depends on the property AlwaysEmitPassThroughFields.
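
The four choices amount to a simple filter over the input record. The Python sketch below illustrates that filter under assumed names (select_pass_through, the sample record, and the field called json are hypothetical); it does not model AlwaysEmitPassThroughFields.

```python
def select_pass_through(record: dict, used_fields: set, mode: str = "Unused") -> dict:
    """Return the input fields that pass through for a PassThroughFields choice."""
    if mode == "All":
        return dict(record)
    if mode == "None":
        return {}
    if mode == "Used":
        return {k: v for k, v in record.items() if k in used_fields}
    if mode == "Unused":
        return {k: v for k, v in record.items() if k not in used_fields}
    raise ValueError(f"unknown PassThroughFields choice: {mode}")

# "json" plays the role of the Data Field that the node parses.
record = {"id": 7, "source": "batch-1", "json": '{"a": 1}'}
used = {"json"}
unused_fields = select_pass_through(record, used)
used_only = select_pass_through(record, used, "Used")
```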

RemoveCommonPrefixes

Optionally specify whether or not the node should attempt to rename the output fields by removing "common prefixes".

For example, if the only parsed data fields to be written to an output are "Top.Middle.First" and "Top.Middle.Second", and this property is set to true, then the fields will be output as "First" and "Second".

The default value is False.

Note: This only removes prefixes from the parsed data fields and not from pass through fields. It also does not affect any reference Id fields.
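
One way to picture the renaming is to compare the hierarchical names level by level and drop the levels that every field shares. The sketch below is an assumption about how such a rule could work, not the node's algorithm; in particular, it never strips the final level so that each field keeps a non-empty name.

```python
def remove_common_prefixes(names: list[str], nesting_char: str = ".") -> list[str]:
    """Drop the leading name levels shared by every field in the list."""
    if not names:
        return []
    split = [name.split(nesting_char) for name in names]
    # Never strip the final component, so every field keeps a non-empty name.
    limit = min(len(parts) for parts in split) - 1
    shared = 0
    while shared < limit and len({parts[shared] for parts in split}) == 1:
        shared += 1
    return [nesting_char.join(parts[shared:]) for parts in split]

renamed = remove_common_prefixes(["Top.Middle.First", "Top.Middle.Second"])
unchanged = remove_common_prefixes(["A.x", "B.y"])
```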

StructureOutput

Optionally specify the name of the output pin to which the structure of the structured data will be written. The Structure output pin simply contains a record for each of the parsed data fields recognized from the input data source(s). The output contains:

  • The type of the field.
  • How the data was parsed from the input source (normally "Data Field").
  • The hierarchical name of the field in the structured data.
  • The name of the field in the output.
  • The name of the output to which the field was written.

In general, the output and input name of a field will be the same, unless RemoveCommonPrefixes is used, or the fields need to be renamed to be written in the BRD format (for example, through the use of the SubstituteInvalidCharacters property).

The default value is "Structure".

If this property is set, then the corresponding output must exist. If this property is not set, but either the DefaultOutput or ErrorsOutput properties are set to "Structure", then the default of "Structure" in this property is ignored, and no structure output records will be written.

Note: While the "Structure" output pin exists by default, it can be renamed. In such cases, unless the StructureOutput property is changed to match the name of one of the node's outputs, the node will not write any output structure records.

ErrorsOutput

Optionally specify the name of the output pin to which errors will be written. The default value is "Errors".

If this property is set, then the corresponding output must exist. If this property is not set, but either the DefaultOutput or StructureOutput properties are set to "Errors", then the default of "Errors" in this property is ignored, and no error records will be written.

Note: While the "Errors" output pin exists by default, it can be renamed. In such cases, unless the ErrorsOutput property is changed to match the name of one of the node's outputs, the node will not write any output error records.

DefaultOutput

The node will attempt to map each field in the structured data to an output that bears the name of the field. For fields that do not have the corresponding output, the node will map them to the output pin specified in this optional property.

The default value is "Data".

If this property is set, then the corresponding output must exist. If this property is not set, but either the ErrorsOutput or StructureOutput properties are set to "Data", then the default of "Data" in this property is ignored, and there is no DefaultOutput to handle unmapped fields.

Note: While the "Data" output pin exists by default, it can be renamed. In such cases, unless the DefaultOutput property is changed to match the name of one of the node's outputs, the node will not have a DefaultOutput to handle unmapped fields.

If this property is not specified, and the default output of "Data" does not exist, then the property UnmappedFieldBehavior is used to determine the action to take when a parsed data field cannot be mapped to any output.

CharacterDataOutputFieldType

Optionally specify the type of the character based (string/unicode) output fields from the parsed data. Choose from:

  • Auto - If the node parses data from an input field, the produced character fields from the structured contents will have the same type as the input field. If the node parses data from a file, the node will output character fields from the structured contents as unicode.
  • String - The fields will have string metadata.
  • Unicode - The fields will have unicode metadata.

Note: This property does not affect the types of PassThrough fields. It also does not affect fields within the parsed data that have an inferred type which is not character based unless the property CoerceData is set to something other than Never.

The default value is Auto.

AlwaysEmitPassThroughFields

Optionally specify whether or not the pass through fields should always be written to outputs which receive them. Choose from:

  • True - Even if the parsing of an input record results in no data fields that are to be written to an output, the pass through fields will still be written. For instance, if the specified "Data Field" or "Filename Field" is NULL, a record will still be written containing the pass through fields.
  • False - The pass through fields will be written to the output only when there are other records to write to that output with parsed data fields.

The default value is False.

InputPrefix

Optionally specify a prefix to be added to the pass through fields.

The main objective for this property is to resolve the potential conflict where a node generated output field has the same name as an input field that the user wants to pass through.

Note that such a conflict will not happen if the user picks None for property PassThroughFields. However, in the absence of field name conflict, you may still want to highlight the pass through input fields by giving them a prefix.

The default value is none.

Example: The input contains the fields EmployeeName, EmployeeAddress, and id. The node generates the output fields id and EmployeeDepartment. The PassThroughFields property is set to All, and InputPrefix is set to PassThrough.

The node will output fields: PassThrough.EmployeeName, PassThrough.EmployeeAddress, PassThrough.id, id, and EmployeeDepartment.
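
The renaming in this example can be reproduced with a short Python sketch. The function apply_input_prefix is hypothetical, and the "." separator between the prefix and the field name is inferred from the example output rather than stated as a rule.

```python
def apply_input_prefix(input_fields, generated_fields, prefix=None, separator="."):
    """Prefix pass-through field names so they cannot clash with generated ones."""
    passed = [f"{prefix}{separator}{name}" if prefix else name for name in input_fields]
    return passed + list(generated_fields)

output_fields = apply_input_prefix(
    ["EmployeeName", "EmployeeAddress", "id"],  # input fields, all passed through
    ["id", "EmployeeDepartment"],               # fields generated by the node
    prefix="PassThrough",
)
```

Without the prefix, the two id fields would share a name, which is the conflict this property is meant to avoid.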

NoRecordForOutputBehavior

Optionally specify the behavior of the node when none of the parsed data can be mapped to any output. Choose from:

  • Error - The node will throw an error and stop processing.
  • Log - The node will log a message and continue processing.
  • Ignore - The node will continue processing.

The default value is Error.

Note: This specifies the behavior when none of the parsed data can be mapped. Therefore, even if there are pass through fields from the input that could be written to an output, if no parsed fields from the data can be mapped to the output, then if this property is set to Error, the node will still error.

Note also that in the cases of Log and Ignore, if there are no pass through fields because either there is no input, or there are no fields to pass through, the node cannot set up the output metadata, and so it will throw an error and stop processing.

PassThroughFieldConflictBehavior

Optionally specify the behavior of the node when there are parsed data fields which conflict with pass through fields on any given output. Choose from:

  • Use PassThrough Field - The pass through field from the input will be written. The parsed data field will not be written to the output.
  • Use Data Field - The parsed data field will be written to the output. The pass through field from the input will not be written to output.
  • Error - The node will throw an error and stop processing.

The default value is Error.
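
The three choices correspond to three ways of merging two sets of fields for one output record. The following Python sketch illustrates them; merge_fields and the sample values are hypothetical.

```python
def merge_fields(pass_through: dict, parsed: dict, behavior: str = "Error") -> dict:
    """Combine pass-through and parsed fields for one output record."""
    conflicts = pass_through.keys() & parsed.keys()
    if behavior == "Error":
        if conflicts:
            raise ValueError(f"conflicting field names: {sorted(conflicts)}")
        return {**pass_through, **parsed}
    if behavior == "Use PassThrough Field":
        # Later entries win, so the pass-through value replaces the parsed one.
        return {**parsed, **pass_through}
    if behavior == "Use Data Field":
        return {**pass_through, **parsed}
    raise ValueError(f"unknown behavior: {behavior}")

from_input = {"id": "input-1", "region": "north"}
from_json = {"id": 42, "name": "widget"}
keep_input = merge_fields(from_input, from_json, "Use PassThrough Field")
keep_data = merge_fields(from_input, from_json, "Use Data Field")
```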

UnmappedFieldBehavior

Optionally specify the behavior of the node when there are parsed data fields that cannot be mapped to any output, and there is no default output (refer to the property DefaultOutput) that collects all such fields. Note that if the default output exists, this situation will not happen. Choose from:

  • Error - The node will throw an error and stop processing.
  • Log - The node will log the situation and continue processing.
  • Ignore - The node will ignore the situation and continue processing.

The default value is Error.
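
The interaction between output names, the DefaultOutput property, and this property can be sketched as a routing function. This is a simplified illustration with assumed names (route_fields and the sample field and output names); it matches fields to outputs by exact name only and does not model the Structure or Errors outputs.

```python
import logging

def route_fields(field_names, outputs, default_output="Data", unmapped="Error"):
    """Map each parsed field to an output name, or apply the unmapped behaviour."""
    routing = {}
    for name in field_names:
        if name in outputs:
            # An output bearing the field's name takes precedence.
            routing[name] = name
        elif default_output in outputs:
            routing[name] = default_output
        elif unmapped == "Error":
            raise ValueError(f"field {name!r} cannot be mapped to any output")
        elif unmapped == "Log":
            logging.warning("field %r cannot be mapped to any output", name)
        # "Ignore": the field is dropped without a message.
    return routing

with_default = route_fields(["Orders", "total"], {"Orders", "Data"})
without_default = route_fields(["Orders", "total"], {"Orders"}, unmapped="Ignore")
```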

NullValueBehavior

Optionally specify the behavior of the node when the data to be parsed is provided from a Data Field or Filename Field and in any of the input records, this field is null. Choose from:

  • Error - The node will throw an error for that record.
  • Log - The node will log the situation and continue processing.
  • Ignore - The node will ignore the situation and continue processing.

The default value is Error.

Note that this only affects the field containing the data to parse - either in a Data Field or Filename Field. This does not affect any other fields (including pass through fields) from the input.

ErrorThreshold

Optionally specify the number of transfer errors that will cause the node to give up and fail. Each record on the input pin is a "request". A transfer error is any error that causes a request to fail (e.g. a requested file does not exist). Setting this property instructs the node to continue processing requests as long as the number of errors remains below the given threshold.

An ErrorThreshold of 0 means never fail on a transfer error (the node will still fail on more serious errors).

The default value is 1, that is, the node fails on the first error encountered.
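
The threshold rule can be sketched as a loop that counts failed requests. This Python illustration uses assumed names (process_requests, fake_read, and the file names) and treats any OSError as a transfer error; the node's own classification of errors may differ.

```python
def process_requests(requests, handler, error_threshold=1):
    """Process requests, failing once the error count reaches the threshold."""
    results, errors = [], []
    for request in requests:
        try:
            results.append(handler(request))
        except OSError as exc:
            errors.append((request, str(exc)))
            # A threshold of 0 means never fail on a transfer error.
            if error_threshold and len(errors) >= error_threshold:
                raise RuntimeError(f"error threshold {error_threshold} reached") from exc
    return results, errors

def fake_read(name):
    """Stand-in for reading a file: names starting with 'missing' fail."""
    if name.startswith("missing"):
        raise FileNotFoundError(name)
    return name.upper()

names = ["a.json", "missing.json", "b.json"]
results, errors = process_requests(names, fake_read, error_threshold=0)
```

With the default threshold of 1, the same list of requests stops at missing.json instead of continuing to b.json.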

SubstituteInvalidCharacters

Data360 Analyze data has some reserved characters which are not allowed to appear in the metadata.

Optionally specify what the node should do in case such characters appear in the input data and are to be used in the record metadata. Choose from:

  • True - The offending characters will be replaced with acceptable BRD metadata characters.
  • False - The node will throw an error and stop processing.

The default value is False.

Note: This only affects reserved characters in the metadata - such as colon (':') and newlines ('\n'). Data360 Analyze also requires that the metadata be in the same character set as the server's character set. If this property is set to true, the node does not attempt to perform any special substitution operations on characters from a different character set that cannot be mapped to the character set of the metadata.
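
The two choices can be illustrated with a small Python sketch. The reserved set below contains only the two characters named in the note, and the underscore replacement is an assumption; the characters the node actually substitutes are not documented here. The function name clean_field_name is hypothetical.

```python
# Only the two reserved characters named in the note; the full set is an unknown.
RESERVED = {":", "\n"}

def clean_field_name(name: str, substitute: bool = False, replacement: str = "_") -> str:
    """Return a metadata-safe field name, or raise if substitution is disabled."""
    if not any(ch in RESERVED for ch in name):
        return name
    if not substitute:
        raise ValueError(f"field name {name!r} contains reserved characters")
    # The replacement character is a hypothetical choice for illustration.
    return "".join(replacement if ch in RESERVED else ch for ch in name)

cleaned = clean_field_name("ns:price", substitute=True)
```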

Inputs and outputs

Inputs: Multiple optional (input fields).

Outputs: Data, Structure, Errors, multiple optional.