A pivot table summarizes data for easier analysis by creating table row and
column categories based on input data. For more information, see Pivot Tables.
In the
Group
Statistics stage options:
-
In the Operations tab, select a field from
Input Fields which contains the data you want to use
as the row labels in your pivot table. Then click the >>
button next to the Rows field.
-
Select a field that contains the data you want to use as the columns in your
pivot table then click the >> button next to the
Columns field.
Tip: At this point, run inspection to see the results of your
selections. This will help you visualize the results of the cross
tabulations based on the columns and rows you have selected.
-
To skip sorting the input records, check Rows and Columns are
presorted in the configured order.
If this field is checked, the stage processes the input records without
sorting them.
Note: Check this if the records are already sorted.
-
To define the operation to be performed, click the >>
button next to the Operations field.
In the Add Operation window:
-
Select the Operation to be performed.
-
In the Input Field section, select the
Name and Type of the
input field on which the operation must be performed.
-
In the Output Field section, enter the
Name and select the
Type of the output field to be generated once
the operation is performed.
-
To fetch the actual count of input records on which the operation is
performed as a separate output column, check Get count of
records that are computed upon.
Records with
null
values are not included in the
count
ComputationalCount<Operation>Of<InputFieldName>
.
Functions
on which Computational Count is supported:
- Average
- Variance
- ZScore
- Standard Deviation
- Percentile
- Percent Rank
- Sum
For any other operation, this check box remains
disabled.
-
To define the output fields for each column in the pivot table, click the
Fields tab of the stage options.
Tip: In order to define fields accurately, run an inspection flow
once, before this step, to see the column names generated by your data.
-
Click Add to display the Add
Field window.
-
In the Add Field window, the grid columns are
based on the Columns fields you chose in the
Operations tab. In these grid columns, enter
those values that you see as the column headings on running an
inspection flow.
The records in the column Data can also be populated in one go by
using the Import feature. To import data from a CSV or TXT file:
- Click Import
- Browse the source file using File name field
- Enter the Field and Record Separator
values
- Click OK
All the records in the file get populated in the Column Data
table.
Note: The source file should not have any header row.
For example, if you selected an input field called ShipDat
e
in Columns in the
Operations tab, the grid in the Add
Field window would have a column labeled
ShipDate
. In this grid column, enter the exact
ShipDate
values present in your flow input data,
such as 2/25/2010
,
1/31/2010
.
-
In the Operation field, select one or more
operations for which output columns are generated for each entered
column field value. Note that the operation you select only affects the
field name and does not control the actual calculation.
To change the operations listed in the
Operation field of the
Fields tab, modify the
Operations field values in the
Operations tab.
Attention: The Computational Count operation option ComputationalCountOperationOfInputFieldName
is listed only if the Get count of records that are computed upon check box is selected while defining the Operation in the Operations tab.
-
Click Add.
-
Click OK.
For each input value you entered in the grid above, output columns are automatically created by mapping those against each of the selected Operation values. A Cartesian product of the entered input column values in the grid and the selected Operations is used to automatically generate the final output columns.
The names of these output columns follow the naming convention
Data_OperationOfInputFieldName
,
where Data
is the value you specified in the
first field, Operation
is the operation you
selected in the Operation field, and
InputFieldName
is input column on which
the Operation
is performed.