This section details how the Java node code can be written to handle node inputs and outputs.
For information on the Java API, see the "laeJavaApi" Zip file which you will find in the following location: <Data360Analyze installation directory>/docs/lae
Handling node input
Opening inputs
The default code in the JavaCode property of the Java node is an example implementation. In general, the inputs are already opened and no work needs to be done in the Java node code.
Finding input fields
If you want to locate a specific field within an input, it is necessary to find the (zero-indexed) index of the field within the input metadata, using the name of the field to search.
For example, the following code is used to locate a field with the name specified in the InputFieldName property within the ith input, and error if the field does not exist on this input:
int idx = input(i).metadata().find(m_inputFieldName);
if (idx == -1) {
    logger().error(Logger.CHAIN_END, "Unable to find required field (" + m_inputFieldName + ") on input (" + i + "): \"" + input(i).name() + "\"");
    throw fail();
}
This code obtains the metadata for the ith input, then searches this metadata for the field with the name m_inputFieldName.
If the field cannot be found in the metadata, then a result of -1 is returned. In this case, the node will error.
Similarly, the index of an input or output can be found using the findInput and findOutput methods respectively, as defined on the Node interface.
It is often the case that the same field will need to be read from each record in an input. When this is the case, it is recommended to store the field index (idx in the above code), so that it can be used without re-searching the record metadata each time. This should be performed in the setup method of the node.
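The lookup-once pattern described above can be sketched outside the node as follows. This is a minimal, stand-alone illustration: FieldIndexCache and the List<String> used in place of the input metadata are hypothetical stand-ins, not part of the Data360 Analyze Java API.

```java
import java.util.Arrays;
import java.util.List;

// Stand-alone sketch: FieldIndexCache and its List<String> "metadata" are
// hypothetical stand-ins, not part of the Data360 Analyze Java API.
final class FieldIndexCache {
    private final int index;

    // Performs the name lookup once, as the node's setup method would.
    FieldIndexCache(List<String> fieldNames, String requiredField) {
        int idx = fieldNames.indexOf(requiredField); // -1 when the name is absent
        if (idx == -1) {
            throw new IllegalArgumentException(
                "Unable to find required field (" + requiredField + ")");
        }
        this.index = idx;
    }

    // Per-record access reuses the cached index; no name search happens here.
    Object field(Object[] record) {
        return record[index];
    }

    int index() {
        return index;
    }

    public static void main(String[] args) {
        FieldIndexCache cache =
            new FieldIndexCache(Arrays.asList("Id", "Foo", "Bar"), "Foo");
        Object[][] records = { {1, "a", true}, {2, "b", false} };
        for (Object[] record : records) {
            System.out.println(cache.field(record));
        }
    }
}
```

In a real node, the constructor logic would sit in setup and the per-record access in processAll, with the index held in a member variable.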
Reading records
In the following example, we want to read records from each input that still has records remaining, until all inputs have been completely read:
Record record = read(i);
if (record != RecordInput.EOF) {…}
This simply reads the next available record from the ith input. If no more records are available from this input, then RecordInput.EOF will be returned.
Therefore, if we only had one input and simply wanted to continue processing records until this input had no more records to read, the following could be used instead:
Record record = null;
while ((record = read(0)) != RecordInput.EOF) {
    //process record
}
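For the multi-input case, one possible loop structure is to visit each unfinished input in turn until every input has reported end of input. The following stand-alone sketch illustrates only that control flow: the queues, the EOF object and the read helper are hypothetical stand-ins for the node inputs, RecordInput.EOF and the node's read(i) method.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Stand-alone sketch: the queues and the EOF object below are hypothetical
// stand-ins for node inputs and RecordInput.EOF.
final class RoundRobinReadSketch {
    static final Object EOF = new Object();

    // Mimics read(i): returns the next record, or EOF when input i is exhausted.
    static Object read(List<Deque<Object>> inputs, int i) {
        Object next = inputs.get(i).poll();
        return next == null ? EOF : next;
    }

    // Visits each unfinished input in turn until every input has returned EOF.
    static List<Object> readAll(List<Deque<Object>> inputs) {
        List<Object> seen = new ArrayList<>();
        boolean[] finished = new boolean[inputs.size()];
        int remaining = inputs.size();
        while (remaining > 0) {
            for (int i = 0; i < inputs.size(); i++) {
                if (finished[i]) {
                    continue; // this input has already reported EOF
                }
                Object record = read(inputs, i);
                if (record == EOF) {
                    finished[i] = true;
                    remaining--;
                } else {
                    seen.add(record); // process record
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        List<Deque<Object>> inputs = new ArrayList<>();
        inputs.add(new ArrayDeque<Object>(Arrays.asList("a1", "a2", "a3")));
        inputs.add(new ArrayDeque<Object>(Arrays.asList("b1")));
        System.out.println(readAll(inputs));
    }
}
```

Tracking finished inputs explicitly means an exhausted input is not read again after it has returned the end-of-input marker.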
Once the record has been obtained, and we have verified that this record is not an end of input indicator, then it is straightforward to get a field from this record.
To obtain the first field defined on a record, the following can be used:
Object field = record.field(0);
However, since field ordering is not guaranteed, it is better to use the name of the field and obtain the index of that field in the metadata. To get the field named "Foo" from the first input, the following can be used:
int index = input(0).metadata().find("Foo");
Object field = record.field(index);
If this is to be done repeatedly, the index should be stored in a variable so that the metadata does not need to be searched every time.
The following example obtains the {{^InputFieldName^}} field from the ith input:
record.field(m_indices.get(i))
The record.field methods always return a java.lang.Object. Therefore, to perform useful processing with the returned field, it will generally be necessary to cast the field to a different type.
When performing these operations, it is recommended to ensure that the metadata on the input is correct for the type you are casting to. For example, the following code could be used to check that the field "Foo" in input 0 contains integer data:
int index = input(0).metadata().find("Foo");
FieldMetadata fieldMd = input(0).metadata().field(index);
Class<?> clazz = fieldMd.type();
if (!java.lang.Integer.class.isAssignableFrom(clazz)) {
    //error
}
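The direction of the isAssignableFrom call matters: it is invoked on the type you intend to cast to and passed the type declared in the metadata. The following stand-alone sketch uses only java.lang classes; isIntegerField is a hypothetical helper name introduced for illustration.

```java
// Stand-alone sketch using only java.lang; isIntegerField is a hypothetical helper.
final class FieldTypeCheckSketch {
    // True when a value of fieldType can safely be cast to Integer.
    static boolean isIntegerField(Class<?> fieldType) {
        return Integer.class.isAssignableFrom(fieldType);
    }

    public static void main(String[] args) {
        System.out.println(isIntegerField(Integer.class)); // true
        System.out.println(isIntegerField(Long.class));    // false
        System.out.println(isIntegerField(Number.class));  // false: a Number need not be an Integer
        // The reverse direction answers a different question:
        System.out.println(Number.class.isAssignableFrom(Integer.class)); // true
    }
}
```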
Null field data
If a field has not been set on an input record, the data viewer will display the value as "NULL". When accessing a field that has not been set on a record, the returned object will be: com.lavastorm.lang.Null.NULL.
Therefore, when reading the field at the index "index" from a record, the following will check that this field value is set on the record:
Object field = record.field(index);
if (field == com.lavastorm.lang.Null.NULL) {
//handle the case where the field is not set
}
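Because the unset marker is a single shared object, the check is an identity comparison performed before any cast. The following stand-alone sketch shows that ordering; the NULL object and the toIntOrDefault helper are hypothetical stand-ins, with NULL playing the role of com.lavastorm.lang.Null.NULL.

```java
// Stand-alone sketch: NULL here is a hypothetical stand-in for
// com.lavastorm.lang.Null.NULL, and toIntOrDefault is an illustrative helper.
final class NullSentinelSketch {
    static final Object NULL = new Object() {
        @Override
        public String toString() {
            return "NULL";
        }
    };

    // Checks for the unset marker before casting, so the cast never sees it.
    static int toIntOrDefault(Object field, int defaultValue) {
        if (field == NULL) {
            return defaultValue; // field was not set on the record
        }
        return (Integer) field;
    }

    public static void main(String[] args) {
        System.out.println(toIntOrDefault(Integer.valueOf(7), -1));
        System.out.println(toIntOrDefault(NULL, -1));
    }
}
```

Casting first and checking afterwards would fail with a ClassCastException whenever the marker object is returned.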
Closing inputs
In general, inputs should be closed within the cleanup method, see Cleanup. Similarly, the setup method should ensure that if it fails, all inputs have subsequently been closed, as the cleanup method will not be called.
Closing an input can cause an IOException to be thrown. Therefore it is important to handle this exception as described in Logging and handling Java errors.
The helper method cleanupIo can be used to close all of the open inputs and outputs and handle any IOExceptions that might be thrown. This will also correctly handle all of the required logging.
Any inputs that have been opened during the setup method should in general be closed in the cleanup method. Closing an input is a very simple operation, and the following code will close the ith input (0-indexed):
input(i).close(false);
The boolean parameter provided to the close method specifies whether or not the input is being closed due to an error.
The close method shown above can throw IOExceptions. To simplify the cleanup of all I/O, it is recommended that the cleanup method calls cleanupIo. This is the default implementation of cleanup provided in the JavaCode property in the Java node and will close any open inputs and outputs, and correctly log any IOExceptions that get thrown, causing the node to error.
Handling node output
This section details how the Java node can be written to handle the node outputs.
Setting output metadata
In general, output metadata should be set in the setup method, see Setup. If the metadata is dependent on data from input records, the metadata should be set in the processAll method.
When using the SimpleFieldMetadata constructors, ensure that you take note of the sign argument. If no sign argument is provided, SIGN_AGNOSTIC is assumed. This is the correct usage for the String, Unicode, Boolean, Date, DateTime and Time field types. However, for the numeric field types (Byte, Short, Integer, Long, Float and Double), you must specify the sign argument, in general FieldMetadata.SIGNED (or SimpleFieldMetadata.SIGNED). This means that the 3-argument SimpleFieldMetadata constructor is required for these types. If no sign argument is provided in these cases, SIGN_AGNOSTIC will be used, and the SIGN_AGNOSTIC implementation for these field types is undefined.
The first operation performed on node outputs is normally to open them. Before an output is opened, the metadata first needs to be defined on the output.
There are a number of different implementations of the RecordMetadata interface. When dealing with node I/O, the constructors on these implementation classes should not be used. Rather, a new metadata object can be obtained from the node output, using:
RecordMetadata metadata = output(outputIdx).newMetadata();
This will ensure that the correct RecordMetadata implementation will be constructed for the node output being used.
Constructing new metadata
The output record metadata is constructed and set as part of the setup method. In this example, the OutputAsFloat and OutputFieldName properties are used to determine the output metadata.
When OutputAsFloat is set to True, the field named by the OutputFieldName property is given a floating-point type. Otherwise, the output metadata is set up with an integer type, as shown in the following code:
// Set up the output metadata according to the properties.
Class<?> outputType = null;
if (m_outputAsFloat)
    outputType = java.lang.Float.class;
else
    outputType = java.lang.Integer.class;
RecordMetadata metadata = output(0).newMetadata();
metadata.add(new SimpleFieldMetadata(m_outputColumnName, outputType, SimpleFieldMetadata.SIGNED));
output(0).metadata(metadata);
The newMetadata call on the RecordOutput constructs a new RecordMetadata object. This is then populated with new FieldMetadata objects (in this case, SimpleFieldMetadata is used). Once all of the required field metadata has been added to the RecordMetadata, the metadata can be set on the RecordOutput. After the metadata has been set on the RecordOutput, no additional field metadata can be added to the RecordMetadata object.
Reusing metadata
While constructing new metadata allows for full control of all fields in the output metadata, it may be that the output metadata should simply be the same as the metadata on an input. In this case, the RecordMetadata.copyFrom method can be used. The code below shows how to set the metadata for the first output to the same as the metadata for the first input:
// Set up the output metadata
RecordMetadata metadata = output(0).newMetadata();
metadata.copyFrom(input(0).metadata());
output(0).metadata(metadata);
Similarly, the following example shows how to set the metadata to be used on multiple outputs:
RecordMetadata metadata0 = output(0).newMetadata();
//construct the metadata for output 0 here…
output(0).metadata(metadata0);
RecordMetadata metadata1 = output(1).newMetadata();
metadata1.copyFrom(metadata0);
output(1).metadata(metadata1);
The same RecordMetadata object cannot be used on multiple outputs. Rather, a different RecordMetadata object needs to be constructed for each output.
Opening outputs
In general, outputs should be opened in the setup method, see Setup. Where the metadata is dependent on data read from input records, the outputs will need to be opened in the processAll method, see ProcessAll.
Once the metadata has been set on an output, the output can be opened. The output must be opened prior to attempting to write to it. Opening an output is a very simple operation, and the following code will open the first output:
openOutput(0);
If multiple outputs are being used, and all need to be opened at the same time (after the metadata has been set on each output), then the following code can be used:
openOutputs();
Writing records
Record writing should generally be performed in the processAll method, see ProcessAll.
Records can only be written to an output after the output has been opened. It is a relatively straightforward process to write records within a Java node. First, a new record is obtained from the output metadata. Then, on the returned record, each of the fields can be populated prior to writing the record to the output.
The following code illustrates how to write a simple record to the first output with one field set (where the variable sum is defined to be a double).
Record record = output(0).metadata().newRecord();
if (m_outputAsFloat)
    record.field(0, sum);
else
    record.field(0, (int) sum);
write(0, record);
Each field that is not set on a record before the record is written will appear as "NULL" in the data viewer. If, in the above example, the first output was defined with metadata containing two fields, the second field would be left as "NULL".
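Note that the (int) cast applied to sum in the record-writing example truncates toward zero rather than rounding. The following stand-alone sketch contrasts the two behaviors using only standard Java; the helper names truncate and roundToInt are hypothetical and chosen for illustration.

```java
// Stand-alone sketch using only java.lang; truncate and roundToInt are
// hypothetical helper names.
final class DoubleToIntSketch {
    // A narrowing cast discards the fractional part (rounds toward zero).
    static int truncate(double value) {
        return (int) value;
    }

    // Rounds to the nearest integer and fails loudly if it does not fit in an int.
    static int roundToInt(double value) {
        return Math.toIntExact(Math.round(value));
    }

    public static void main(String[] args) {
        System.out.println(truncate(2.9));    // 2
        System.out.println(truncate(-2.9));   // -2
        System.out.println(roundToInt(2.9));  // 3
        System.out.println(roundToInt(-2.9)); // -3
    }
}
```

Which behavior is appropriate depends on what the output field is meant to represent, so the choice is worth making explicitly.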
Closing outputs
In general, outputs should be closed within the cleanup method, see Cleanup. Similarly, the setup method should ensure that if it fails, any outputs it has opened have subsequently been closed, as the cleanup method will not be called.
Closing an output can cause an IOException to be thrown. Therefore it is important to handle this exception as described in Logging and handling Java errors.
Any outputs that have been opened during the setup method should in general be closed in the cleanup method.
Closing an output is a very simple operation, and the following code will close the first output:
output(0).close(false);
The boolean parameter provided to the close method specifies whether or not the output is being closed due to an error.
The close method shown above can throw IOExceptions. To simplify the cleanup of all I/O, it is recommended that the cleanup method simply calls cleanupIo. This is the default implementation of cleanup provided in the JavaCode property of the Java node. The cleanupIo method closes any open inputs and outputs. If any IOExceptions are thrown during this process, the cleanupIo method will log them appropriately and the node will error.