Handling Java node inputs and outputs - Latest

Data360 Analyze Server Help

Copyright
2024
First publish date
2016
Last updated
2024-11-28
Published on
2024-11-28T15:26:57.181000

This section details how the Java node code can be written to handle node inputs and outputs.

For information on the Java API, see the "laeJavaApi" Zip file which you will find in the following location: <Data360Analyze installation directory>/docs/lae

Handling node input

Opening inputs

The default code in the JavaCode property of the Java node is an example implementation. In general, the inputs are already opened and no work needs to be done in the Java node code.

Finding input fields

If you want to locate a specific field within an input, it is necessary to find the (zero-indexed) index of the field within the input metadata, using the name of the field to search.

For example, the following code locates the field whose name is given by the InputFieldName property within the ith input, and raises an error if the field does not exist on that input:

int idx = input(i).metadata().find(m_inputFieldName);
if (idx == -1) {
    logger().error(Logger.CHAIN_END, "Unable to find required field ("+m_inputFieldName+") on input ("+i+"): \""+input(i).name()+"\"");
    throw fail();
}

This code obtains the metadata for the ith input, then searches this metadata for the field with the name m_inputFieldName.

If the field cannot be found in the metadata, then a result of -1 is returned. In this case, the node will error.

Similarly, the index of an input or output can be found using the findInput and findOutput methods respectively as defined on the Node interface.

It is often the case that the same field needs to be read from each record in an input. When this is the case, it is recommended to store the field index (idx in the above code) so that it can be used without searching the record metadata again for every record. The lookup should be performed in the setup method of the node.
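As a rough illustration of why the index is worth storing, the following self-contained sketch uses plain Java collections as stand-ins for the node API. The class FieldIndexCacheSketch, the method findField and the list-based records are hypothetical and are not part of the Data360 Analyze Java API. The lookup by name happens once, and the stored index is then reused for every record.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in types: "metadata" is a list of field names and a
// "record" is a list of values in the same order. This is not the real node API.
final class FieldIndexCacheSketch {

    // Mirrors the behavior described above: zero-based index, or -1 if absent.
    static int findField(List<String> metadata, String name) {
        return metadata.indexOf(name);
    }

    // Looks the field up once, then reuses the index for every record.
    static long sumField(List<String> metadata, List<List<Object>> records, String name) {
        int idx = findField(metadata, name); // done once, as in setup
        if (idx == -1) {
            throw new IllegalArgumentException("Unable to find required field (" + name + ")");
        }
        long total = 0;
        for (List<Object> record : records) { // done per record, as in processAll
            total += ((Number) record.get(idx)).longValue();
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> metadata = Arrays.asList("id", "amount");
        List<List<Object>> records = Arrays.asList(
                Arrays.<Object>asList(1, 10),
                Arrays.<Object>asList(2, 32));
        System.out.println(sumField(metadata, records, "amount"));
    }
}
```

In a real node, the stored index would typically live in a member variable that is assigned in setup and read in processAll.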

Reading records

Note: Record processing should always be performed in the processAll method, see ProcessAll.

In the following example, we want to read records from each input that still has records remaining, until all inputs have been completely read:

Record record = read(i);
if (record != RecordInput.EOF) {…}

This simply reads the next available record from the ith input. If no more records are available from this input, then RecordInput.EOF will be returned.

Therefore, if we only had one input and simply wanted to continue processing records until this input had no more records to read, the following could be used instead:

Record record = null;
while ((record = read(0)) != RecordInput.EOF) {
    //process record
}
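The multi-input case mentioned earlier, where each input is read until all inputs are exhausted, could be structured along the following lines. This is only a sketch built on stand-in types: MultiInputReadSketch, its EOF sentinel and its read method are hypothetical and merely imitate the shape of read(i) and RecordInput.EOF. The loop visits each input in turn, skips inputs that have already finished, and stops once every input has returned the end of input indicator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-ins for node inputs; this is not the real node API.
final class MultiInputReadSketch {

    // Sentinel playing the role of RecordInput.EOF in this sketch.
    static final Object EOF = new Object();

    private final List<Iterator<String>> inputs = new ArrayList<>();

    MultiInputReadSketch(List<List<String>> data) {
        for (List<String> rows : data) {
            inputs.add(rows.iterator());
        }
    }

    // Returns the next record from input i, or EOF once that input is exhausted.
    Object read(int i) {
        Iterator<String> it = inputs.get(i);
        return it.hasNext() ? it.next() : EOF;
    }

    // Visits the inputs in turn, skipping finished ones, until all have returned EOF.
    List<String> readAll() {
        List<String> seen = new ArrayList<>();
        boolean[] finished = new boolean[inputs.size()];
        int remaining = inputs.size();
        while (remaining > 0) {
            for (int i = 0; i < inputs.size(); i++) {
                if (finished[i]) {
                    continue;
                }
                Object record = read(i);
                if (record == EOF) {
                    finished[i] = true;
                    remaining--;
                } else {
                    seen.add(i + ":" + record); // process the record here
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        MultiInputReadSketch node = new MultiInputReadSketch(Arrays.asList(
                Arrays.asList("a", "b", "c"),
                Arrays.asList("x")));
        System.out.println(node.readAll());
    }
}
```

Tracking the finished inputs avoids calling read again on an input that has already reported the end of its data.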

Once the record has been obtained, and we have verified that this record is not an end of input indicator, then it is straightforward to get a field from this record.

To obtain the first field defined on a record, the following can be used:

Object field = record.field(0);

However, since field ordering is not guaranteed, it is better to use the name of the field and obtain the index of that field in the metadata. To get the field named "Foo" from the first input, the following can be used:

int index = input(0).metadata().find("Foo");
Object field = record.field(index);

If this is to be done repeatedly, the index should be stored in a variable so that the metadata does not need to be searched every time.

The following example obtains the {{^InputFieldName^}} field from the ith input:

record.field(m_indices.get(i))

Note: The record.field methods will always return a java.lang.Object. Therefore, to perform useful processing with the returned field, it will generally be necessary to cast the field to a different type.

When performing these operations, it is recommended to check that the metadata on the input is correct for the type you are casting to. For example, the following code could be used to check that the field "Foo" in input 0 contains integer data:

int index = input(0).metadata().find("Foo");
FieldMetadata fieldMd = input(0).metadata().field(index);
Class<?> clazz = fieldMd.type();
if (!java.lang.Integer.class.isAssignableFrom(clazz)) {
    //error
}

Note: The above code will only handle Integer types, and will still error for other data types such as Byte and Long.
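If other integral types should also be accepted, one possible approach is to test the declared type against each of the JDK wrapper classes. The sketch below uses only standard Java. The class IntegralTypeCheckSketch and its methods are hypothetical helpers, and it is an assumption that the type reported by the field metadata is one of these java.lang wrapper classes.

```java
// Uses only the JDK; no node API calls are involved.
final class IntegralTypeCheckSketch {

    // True when the declared field type is one of the JDK's integral wrapper classes.
    static boolean isIntegral(Class<?> clazz) {
        return clazz == Byte.class
                || clazz == Short.class
                || clazz == Integer.class
                || clazz == Long.class;
    }

    // Widens any integral wrapper value to long; rejects everything else.
    static long toLong(Object field) {
        if (field == null || !isIntegral(field.getClass())) {
            throw new IllegalArgumentException("Field does not hold integral data: " + field);
        }
        return ((Number) field).longValue();
    }

    public static void main(String[] args) {
        System.out.println(isIntegral(Long.class));
        System.out.println(toLong((short) 7));
    }
}
```

Widening to long means a single code path can process Byte, Short, Integer and Long values without loss of precision.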

Null field data

If a field has not been set on an input record, the data viewer will display the value as "NULL". When accessing a field that has not been set on a record, the returned object will be: com.lavastorm.lang.Null.NULL.

Therefore, when reading the field at the index "index" from a record, the following will check that this field value is set on the record:

Object field = record.field(index);
if (field == com.lavastorm.lang.Null.NULL) {
    //handle the case where the field is not set
}
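The same identity comparison can be wrapped in a small helper so that unset fields are replaced by a default before casting. The following sketch is self-contained and therefore defines its own NULL sentinel. NullFieldSketch and intOrDefault are hypothetical names, and the sentinel here only imitates com.lavastorm.lang.Null.NULL.

```java
// Hypothetical stand-in for the null sentinel; not the real com.lavastorm.lang.Null class.
final class NullFieldSketch {

    // Singleton marker meaning "field not set"; compared by identity.
    static final Object NULL = new Object();

    // Returns the field as an Integer, or defaultValue when the field is not set.
    static Integer intOrDefault(Object field, Integer defaultValue) {
        if (field == NULL) {
            return defaultValue;
        }
        return (Integer) field; // assumes the metadata was already checked to be Integer
    }

    public static void main(String[] args) {
        System.out.println(intOrDefault(5, 0));
        System.out.println(intOrDefault(NULL, 0));
    }
}
```

Checking for the sentinel before the cast matters because casting the sentinel object itself to Integer would fail with a ClassCastException.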

Closing inputs

In general, inputs should be closed within the cleanup method, see Cleanup. Similarly, the setup method should ensure that if it fails, all inputs have subsequently been closed, as the cleanup method will not be called.

Note: Closing an input can cause an IOException to be thrown. Therefore it is important to handle this exception as described in Logging and handling Java errors.

The helper method cleanupIo can be used to close all of the open inputs and outputs and handle any IOExceptions that might be thrown. This will also correctly handle all of the required logging.

Any inputs that have been opened during the setup method should in general be closed in the cleanup method. Closing an input is a very simple operation, and the following code will close the ith input (0-indexed):

input(i).close(false);

The boolean parameter provided to the close method specifies whether or not the input is being closed due to an error.

The close method shown above can throw IOExceptions. To simplify the cleanup of all I/O, it is recommended that the cleanup method calls cleanupIo. This is the default implementation of cleanup provided in the JavaCode property in the Java node and will close any open inputs and outputs, and correctly log any IOExceptions that get thrown, causing the node to error.
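To show the general pattern that a helper such as cleanupIo saves you from writing by hand, the sketch below closes a list of resources and records every IOException. It uses java.io.Closeable as a stand-in for node inputs and outputs. CleanupSketch and closeAll are hypothetical, and whether the real helper continues after the first failure is not stated here; this sketch assumes that it does.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Uses java.io.Closeable as a stand-in for node inputs and outputs; not the real cleanupIo.
final class CleanupSketch {

    // Attempts to close every resource, collecting failures instead of stopping at the first.
    static List<IOException> closeAll(List<? extends Closeable> resources) {
        List<IOException> failures = new ArrayList<>();
        for (Closeable resource : resources) {
            try {
                resource.close();
            } catch (IOException e) {
                failures.add(e); // a real node would log this and then error
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        Closeable ok = () -> { };
        Closeable broken = () -> { throw new IOException("disk full"); };
        List<IOException> failures = closeAll(Arrays.asList(broken, ok));
        System.out.println(failures.size() + " failure(s) while closing");
    }
}
```

Collecting the failures rather than rethrowing immediately gives every remaining resource a chance to be closed.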

Handling node output

This section details how the Java node can be written to handle the node outputs.

Setting output metadata

Note: Wherever possible, outputs should have their metadata set in the setup method, see Setup. If the metadata is dependent on data from input records, the metadata should be set in the processAll method.
CAUTION:
When setting up field metadata using the SimpleFieldMetadata constructors, take note of the sign argument. If no sign argument is provided, SIGN_AGNOSTIC is assumed. This is correct for the String, Unicode, Boolean, Date, DateTime and Time field types. For the numeric field types (Byte, Short, Integer, Long, Float and Double), you must specify the sign argument, which means using the 3-argument SimpleFieldMetadata constructor. In general, the value to pass is FieldMetadata.SIGNED (or SimpleFieldMetadata.SIGNED). If the sign argument is omitted for a numeric field type, SIGN_AGNOSTIC is used, and the behavior of SIGN_AGNOSTIC for these field types is undefined.
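One way to guard against forgetting the sign argument is a small check on the field type before the metadata is constructed. The sketch below uses only the JDK. SignArgumentSketch and needsSignArgument are hypothetical names, and it is an assumption that the numeric field types listed above correspond to the java.lang wrapper classes.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Uses only the JDK; decides whether a field type falls in the numeric group listed above.
final class SignArgumentSketch {

    // Assumed mapping of the numeric field types to java.lang wrapper classes.
    private static final Set<Class<?>> NUMERIC_TYPES = new HashSet<>(Arrays.<Class<?>>asList(
            Byte.class, Short.class, Integer.class, Long.class, Float.class, Double.class));

    // True when the sign argument must be given explicitly for this field type.
    static boolean needsSignArgument(Class<?> fieldType) {
        return NUMERIC_TYPES.contains(fieldType);
    }

    public static void main(String[] args) {
        System.out.println(needsSignArgument(Integer.class));
        System.out.println(needsSignArgument(String.class));
    }
}
```

A node could call such a check when building output metadata and choose the 3-argument constructor whenever it returns true.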

Whereas the first operation performed on node inputs is normally to open them, an output cannot be opened until its metadata has been defined.

There are a number of different implementations of the RecordMetadata interface. When dealing with node I/O, the constructors on these implementation classes should not be used. Rather, a new metadata object can be obtained off the node output, using:

RecordMetadata metadata = output(outputIdx).newMetadata();

This will ensure that the correct RecordMetadata implementation will be constructed for the node output being used.

Constructing new metadata

The output record metadata is constructed and set as part of the setup method. In this example, the OutputAsFloat and OutputFieldName properties are used to determine the output metadata.

When OutputAsFloat is set to True, the field named by the OutputFieldName property is given a floating point type. Otherwise, the output metadata is set up with an integer type, as shown in the following code:

//Set up the output metadata according to the properties.
Class<?> outputType = null;
if (m_outputAsFloat)
    outputType = java.lang.Float.class;
else
    outputType = java.lang.Integer.class;

RecordMetadata metadata = output(0).newMetadata();
metadata.add(new SimpleFieldMetadata(m_outputFieldName, outputType, SimpleFieldMetadata.SIGNED));
output(0).metadata(metadata);

The newMetadata call on the RecordOutput constructs a new RecordMetadata object. This is then populated with new FieldMetadata objects (in this case, SimpleFieldMetadata is used). Once all of the required field metadata has been added to the RecordMetadata, the metadata can be set on the RecordOutput. After the metadata has been set on the RecordOutput, no additional field metadata can be added to the RecordMetadata object.

Reusing metadata

While constructing new metadata allows for full control of all fields in the output metadata, it may be that the output metadata should simply be the same as the metadata on an input. In this case, the RecordMetadata.copyFrom method can be used. The code below shows how this can be done for setting the metadata for the first output to the same as the metadata for the first input:

//Set up the output metadata
RecordMetadata metadata = output(0).newMetadata();
metadata.copyFrom(input(0).metadata());
output(0).metadata(metadata);

Similarly, the following example shows how to set the metadata to be used on multiple outputs:

RecordMetadata metadata0 = output(0).newMetadata();
//construct the metadata for output 0 here…
output(0).metadata(metadata0);
RecordMetadata metadata1 = output(1).newMetadata();
metadata1.copyFrom(metadata0);
output(1).metadata(metadata1);

Note: The same RecordMetadata object cannot be used on multiple outputs. A separate RecordMetadata object needs to be constructed for each output.

Opening outputs

Note: Wherever possible, outputs should be opened in the setup method, see Setup. Where the metadata is dependent on data read from input records, then the outputs will need to be opened in the processAll method, see ProcessAll.

Once the metadata has been set on an output, the output can be opened. The output must be opened prior to attempting to write to it. Opening an output is a very simple operation, and the following code will open the first output:

openOutput(0);

If multiple outputs are being used, and all need to be opened at the same time (after the metadata has been set on each output), then the following code can be used:

openOutputs();

Writing records

Record writing should generally be performed in the processAll method, see ProcessAll.

Records can only be written to an output after the output has been opened. It is a relatively straightforward process to write records within a Java node. First, a new record is obtained from the output metadata. Then, on the returned record, each of the fields can be populated prior to writing the record to the output.

The following code illustrates how to write a simple record to the first output with one field set (where the variable sum is defined to be a double):

Record record = output(0).metadata().newRecord();
if (m_outputAsFloat)
    record.field(0, sum);
else
    record.field(0, (int)sum);

write(0, record);

Each field that is not set on a record before the record is written will appear as "NULL" in the data viewer. If, in the above example, the first output had been defined with metadata containing two fields, the second field would be left as "NULL".
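One detail of the integer branch in the example above is that a Java cast from double to int discards the fractional part. The following sketch, which uses only the JDK, contrasts that cast with a rounding alternative. DoubleToIntSketch and its methods are hypothetical helpers and are not part of the node API.

```java
// Uses only the JDK; illustrates the (int) cast applied to a double value.
final class DoubleToIntSketch {

    // Same conversion as the cast in the example: the fractional part is discarded.
    static int truncate(double sum) {
        return (int) sum;
    }

    // Alternative: rounds to the nearest integer and throws ArithmeticException
    // if the rounded value does not fit in an int.
    static int roundExact(double sum) {
        return Math.toIntExact(Math.round(sum));
    }

    public static void main(String[] args) {
        System.out.println(truncate(2.9));
        System.out.println(roundExact(2.9));
    }
}
```

Which conversion is appropriate depends on what the integer output is meant to represent, so the choice is best made explicitly rather than left to the cast.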

Closing outputs

In general, outputs should be closed within the cleanup method, see Cleanup. Similarly, the setup method should ensure that if it fails, any outputs it has opened have subsequently been closed, as the cleanup method will not be called.

Note: Closing an output can cause an IOException to be thrown. Therefore it is important to handle this exception as described in Logging and handling Java errors.

Any outputs that have been opened during the setup method should in general be closed in the cleanup method.

Closing an output is a very simple operation, and the following code will close the first output:

output(0).close(false);

The boolean parameter provided to the close method specifies whether or not the output is being closed due to an error.

The close method shown above can throw IOExceptions. In order to simplify the cleanup of all I/O, it is recommended that the cleanup method simply calls cleanupIo. This is the default implementation of cleanup provided in the JavaCode property of the Java node. The cleanupIo method closes any open inputs and outputs. If any IOExceptions get thrown during this process, the cleanupIo will log them appropriately and the node will error.