Creating a Pivot Table - 23.1

Spectrum Dataflow Designer Guide

Version
23.1
Language
English
Product name
Spectrum Technology Platform
Title
Spectrum Dataflow Designer Guide
First publish date
2007
Last updated
2024-05-09
Published on
2024-05-09T23:01:03.226155

A pivot table summarizes data for easier analysis by creating table row and column categories based on input data. For more information, see Pivot Tables.

In the Group Statistics stage options:
  1. In the Operations tab, select a field from Input Fields which contains the data you want to use as the row labels in your pivot table. Then click the >> button next to the Rows field.
  2. Select a field that contains the data you want to use as the columns in your pivot table then click the >> button next to the Columns field.
    Tip: At this point, run inspection to see the results of your selections. This will help you visualize the results of the cross tabulations based on the columns and rows you have selected.
  3. To skip sorting the input records, check Rows and Columns are presorted in the configured order.
    If this field is checked, the stage processes the input records without sorting them.
    Note: Check this if the records are already sorted.
  4. To define the operation to be performed, click the >> button next to the Operations field.
    In the Add Operation window:
    1. Select the Operation to be performed.
    2. In the Input Field section, select the Name and Type of the input field on which the operation must be performed.
    3. In the Output Field section, enter the Name and select the Type of the output field to be generated once the operation is performed.
    4. To fetch the actual count of input records on which the operation is performed as a separate output column, check Get count of records that are computed upon.
      Records with null values are not included in the count ComputationalCount<Operation>Of<InputFieldName>.

      Functions on which Computational Count is supported:
      • Average
      • Variance
      • ZScore
      • Standard Deviation
      • Percentile
      • Percent Rank
      • Sum
      For any other operation, this check box remains disabled.
  5. To define the output fields for each column in the pivot table, click the Fields tab of the stage options.
    Tip: In order to define fields accurately, run an inspection flow once, before this step, to see the column names generated by your data.
    1. Click Add to display the Add Field window.
    2. In the Add Field window, the grid columns are based on the Columns fields you chose in the Operations tab. In these grid columns, enter those values that you see as the column headings on running an inspection flow.
      The records in the column Data can also be populated in one go by using the Import feature. To import data from a CSV or TXT file:
      1. Click Import
      2. Browse the source file using File name field
      3. Enter the Field and Record Separator values
      4. Click OK
      All the records in the file get populated in the Column Data table.
      Note: The source file should not have any header row.
      For example, if you selected an input field called ShipDat e in Columns in the Operations tab, the grid in the Add Field window would have a column labeled ShipDate. In this grid column, enter the exact ShipDate values present in your flow input data, such as 2/25/2010, 1/31/2010.
    3. In the Operation field, select one or more operations for which output columns are generated for each entered column field value. Note that the operation you select only affects the field name and does not control the actual calculation.
      To change the operations listed in the Operation field of the Fields tab, modify the Operations field values in the Operations tab.
      Attention: The Computational Count operation option ComputationalCountOperationOfInputFieldName is listed only if the Get count of records that are computed upon check box is selected while defining the Operation in the Operations tab.
    4. Click Add.
  6. Click OK.

For each input value you entered in the grid above, output columns are automatically created by mapping those against each of the selected Operation values. A Cartesian product of the entered input column values in the grid and the selected Operations is used to automatically generate the final output columns.

The names of these output columns follow the naming convention Data_OperationOfInputFieldName, where Data is the value you specified in the first field, Operation is the operation you selected in the Operation field, and InputFieldName is input column on which the Operation is performed.