Field Parser - 23.1

Spectrum Dataflow Designer Guide

Version: 23.1
Language: English
Product name: Spectrum Technology Platform
First publish date: 2007
Last updated: 2024-05-09
Published on: 2024-05-09T23:01:03.226155

The Field Parser stage extracts fields from XML and delimited data in the specified input column. To configure the Field Parser options, perform the following tasks.

  1. In the Source field, select the column that contains the XML or delimited data to be parsed.
    Note: The drop-down displays all the string input columns.
  2. Select the XML or Delimited format based on the type of data you want to parse, and then set the corresponding options described below.

Field Parser Options for XML Data

Option Name Description
Server name

Indicates whether the file selected for inferring the schema is located on the computer running the Spectrum Enterprise Designer or on the server. If you select a file on the local computer, the server name will be My Computer. If you select a file on the server, the server name will be Spectrum Technology Platform.

Schema file

Specifies the path to an XSD schema file. Click the ellipsis button (...) to navigate to the file location. The schema file can reside on the server or your local system.

Alternatively, you can specify an XML file instead of an XSD file. If you specify an XML file, the schema is inferred from the structure of the XML file. Using an XML file instead of an XSD file has the following limitations:

  • The XML file cannot be larger than 1 MB. If the XML file is more than 1 MB in size, try removing some of the data while maintaining the structure of the XML.
  • The data file will not be validated against the inferred schema.
Note: If the Spectrum Technology Platform server is running on Linux, remember that file names and paths on this platform are case sensitive.
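
The advice above to remove some of the data while keeping the structure of an oversized XML file can be sketched in Python. This is only an illustration and is not part of Spectrum Technology Platform: the helper name trim_repeats and the sample document are hypothetical. It keeps the first occurrence of each repeated sibling element, so it preserves the structure only if the removed elements do not contain children or attributes that are missing from the ones that are kept.

```python
import xml.etree.ElementTree as ET


def trim_repeats(element, keep=1):
    """Remove repeated sibling elements in place, keeping the first `keep` of each tag.

    Hypothetical helper: it shrinks a sample file for schema inference and
    assumes that repeated elements all share the same structure.
    """
    seen = {}
    for child in list(element):  # iterate over a copy because children are removed
        seen[child.tag] = seen.get(child.tag, 0) + 1
        if seen[child.tag] > keep:
            element.remove(child)
        else:
            trim_repeats(child, keep)
    return element


sample = ET.fromstring(
    "<orders>"
    "<order id='1'><item>a</item><item>b</item></order>"
    "<order id='2'><item>c</item></order>"
    "</orders>"
)
trim_repeats(sample)
print(ET.tostring(sample, encoding="unicode"))
```

Check the size of the trimmed file before you select it as the schema file.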
Output Fields

This section displays details of the selected schema. It includes the root element followed by the child elements along with their attributes.

By default, all the nodes of the schema are selected. Clear the check box of any node that you do not want passed to the next stage.
  • Search node: Type the name of the node to which you want to navigate in the schema tree. The matching node is highlighted in the preview pane below the field.
  • XPath: Click anywhere in this field to view the XML path (XPath) of the elements and attributes of the node highlighted in the schema tree. To see the XPaths you viewed previously, click the down arrow at the right end of the field.
    Note: XPath is a language for finding information in an XML document. For further details on this, see https://www.w3schools.com/xml/xml_xpath.asp
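
To illustrate how an XPath addresses elements and attributes, the following Python sketch lists the paths found in a small XML document and then selects values with one of them. It uses the standard xml.etree.ElementTree module, which supports only a subset of XPath. The sample document and the list_xpaths helper are hypothetical, and the paths that the Field Parser stage displays may be formatted differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, used only for this illustration.
SAMPLE_XML = """<customers>
  <customer id="1">
    <name>Arjun</name>
    <city>Pune</city>
  </customer>
  <customer id="2">
    <name>Deepak</name>
    <city>Delhi</city>
  </customer>
</customers>"""


def list_xpaths(element, prefix=""):
    """Return the distinct paths of an element, its attributes, and its descendants."""
    path = f"{prefix}/{element.tag}"
    paths = [path]
    # Attributes are addressed with the @ prefix in XPath.
    paths += [f"{path}/@{name}" for name in element.attrib]
    for child in element:
        for child_path in list_xpaths(child, path):
            if child_path not in paths:
                paths.append(child_path)
    return paths


root = ET.fromstring(SAMPLE_XML)
xpaths = list_xpaths(root)
for xpath in xpaths:
    print(xpath)

# Select every customer name with a relative path expression.
names = [node.text for node in root.findall("./customer/name")]
print(names)  # ['Arjun', 'Deepak']
```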

Field Parser Options for Delimited Data

Option Name Description
Field separator

From the dropdown list, select the field separator used in the delimited column to be parsed.

If the delimited column uses a different character as a field separator, click the ellipsis button to select another character as the field separator.

Text qualifier

From the dropdown list, select the text qualifier used in the delimited column to be parsed.

Note: A text qualifier is the character used to surround text values in delimited data.

If the delimited column uses a different text qualifier, click the ellipsis button to select another character as the text qualifier.
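
The roles of the field separator and the text qualifier can be shown with a minimal Python sketch based on the standard csv module. It is not the parser used by the Field Parser stage: the function name split_delimited and the sample values are hypothetical. The point it illustrates is that a separator character inside qualified text does not split the value.

```python
import csv


def split_delimited(value, field_separator=";", text_qualifier='"'):
    """Split one delimited string into its values (hypothetical helper)."""
    reader = csv.reader([value], delimiter=field_separator, quotechar=text_qualifier)
    return next(reader)


# The semicolon inside the qualified text is kept as part of the value.
print(split_delimited('1;"Sharma; Arjun";"Pune"'))  # ['1', 'Sharma; Arjun', 'Pune']

# A different separator and qualifier, as when another character is selected.
print(split_delimited("a|'b|c'|d", field_separator="|", text_qualifier="'"))  # ['a', 'b|c', 'd']
```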

Output type

Select whether you want the parsed output in the form of a List (hierarchical display of values) or Fields.

Note: With List as the output type, you can add only one output field. The Fields option allows you to add multiple fields, so that the parsed values are placed in separate fields.
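
The difference between the two output types can be pictured with the following Python sketch. It assumes that List places all parsed values under a single output field and that Fields places each value in its own named field. The function names and field names are hypothetical, and the actual record layout produced by the stage may differ.

```python
def to_list_output(values, field_name):
    """List output type (assumed shape): one field that holds every parsed value."""
    return {field_name: list(values)}


def to_fields_output(values, field_names):
    """Fields output type (assumed shape): one named field per parsed value."""
    return dict(zip(field_names, values))


values = ["1", "Deepak", "65152"]  # values already split from "1,Deepak,65152"

print(to_list_output(values, "Parsed"))
# {'Parsed': ['1', 'Deepak', '65152']}

print(to_fields_output(values, ["Id", "Name", "Code"]))
# {'Id': '1', 'Name': 'Deepak', 'Code': '65152'}
```
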
Output Fields

This section allows you to add or modify the fields into which you want the values of the delimited column to be separated. You can also delete any of the added output fields.

To add a new field for displaying the parsed output, click the Add button, and perform these steps in the Field Setting pop-up that is displayed:
  1. Enter the Name of the field.
  2. From the Type drop-down, select the data type for the field being added. Depending on the selected type, a few more settings can be defined. For example, for a date, you can define its format as M/d/yy, MMM d.yyyy, or MMMM d.yyyy. For details on the data types and their settings, see Defining Fields In a Delimited Input File.
    Note: If you select String as the data type, any type of delimited data will be parsed. However, you can also select a more specific type, based on the data you want to parse into the field.
  3. In the Position field, enter the position (in the input file) of the value that is to be parsed into this field. For example, in the following file snippet, if you want to parse the date time values into the field being added, enter the Position as 3.
    true;"02/02/2022";"10/2/92 5:05 AM";598985994665542.25634;1;
    "Arjun";74785.155;5:05PM,1,Deepak,65152
    false;"15/03/1923";"3/23/90 11:55 AM";3425699466554.2563;2;
    "sharma";5.1;5:45AM,2,Arjun,365273          
  4. Click Add Field, and then click Close.
The added field and its details are displayed in the box.
Note: If you want to have any excess space characters removed from the beginning and end of a field's value string, select the Trim check box.
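
The Position and Trim settings described above can be sketched in Python using the first line of the sample snippet. This is only an illustration under stated assumptions: positions are counted from 1, the date time value follows a month/day/year pattern with a 12-hour clock, and the helper name field_at is hypothetical. It does not reproduce the stage's own type conversion.

```python
import csv
from datetime import datetime

# First line of the sample snippet shown in step 3.
RECORD = 'true;"02/02/2022";"10/2/92 5:05 AM";598985994665542.25634;1;'


def field_at(record, position, trim=False, separator=";", qualifier='"'):
    """Return the value at a 1-based position in a delimited record (hypothetical helper)."""
    values = next(csv.reader([record], delimiter=separator, quotechar=qualifier))
    value = values[position - 1]  # assumption: Position is counted from 1
    # Trim removes excess spaces from the beginning and end of the value.
    return value.strip() if trim else value


raw = field_at(RECORD, 3)
print(raw)  # 10/2/92 5:05 AM

# Assumed pattern for the sample value: month/day/two-digit year, 12-hour time.
parsed = datetime.strptime(raw, "%m/%d/%y %I:%M %p")
print(parsed.isoformat())

print(repr(field_at("a;  b  ;c", 2, trim=True)))  # 'b'
```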

Modify: Click this button to modify details of any of the added output fields.

Remove: Click this button to delete any of the added output fields.

Runtime: Use this button to specify multiple runtime instances of the parser. Running multiple instances can improve performance.
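
The idea behind multiple runtime instances can be pictured as several workers sharing the records to be parsed. The Python sketch below illustrates only that idea with a thread pool from the standard library. It does not describe how Spectrum Technology Platform runs its instances, the names parse_record and parse_all are hypothetical, and it makes no claim about the performance gain you will see.

```python
import csv
from concurrent.futures import ThreadPoolExecutor


def parse_record(record):
    """Split one semicolon-delimited record into values (hypothetical helper)."""
    return next(csv.reader([record], delimiter=";", quotechar='"'))


def parse_all(records, instances=2):
    """Distribute records among `instances` workers; results keep the input order."""
    with ThreadPoolExecutor(max_workers=instances) as pool:
        return list(pool.map(parse_record, records))


print(parse_all(['1;"a"', '2;"b"', '3;"c"'], instances=3))
# [['1', 'a'], ['2', 'b'], ['3', 'c']]
```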

OK: Click this button to save all the details entered in this stage.

Cancel: Click this button to cancel all the updates you made.

Help: Click this button to refer to the help file for this stage.