You can view data at any stage in a data flow, from input, through the various transformations, to the final export or publishing of the data. As you execute nodes, you can inspect their input or output to confirm that they are working as expected. If a node has multiple output pins, each pin will typically output a specific type of data.
Viewing data in the data viewer
- To expand the data viewer, click an input pin or an output pin that contains data (pins with data are green), or click the record count to the right of a node. You can hover over a pin to see its name.
- For each data set that you inspect, a new tab is created in the data viewer.
- The data viewer shows at most the first 1000 records of any data set. For nodes that have been executed but produced no records, you can still view the corresponding metadata (field names and data types) in the data viewer.
- If you re-run a node, or clear the status of a node for which you are currently viewing data in the data viewer, the data viewer will close to ensure that you are not viewing outdated data.
- To help you organize and work with your data in the data viewer, row numbers are displayed on the far left of the rows. In the bottom left corner of the screen, a count shows how many records are displayed against the total number of records in the entire data set.
- You can reorder columns in the data viewer to make it easier to inspect the fields of interest in a data set. You can drag and drop columns to a new location, or you can choose one of the following options from the column menu: Move to Beginning, Move to End, Move Left or Move Right. The column order is remembered for as long as the tab remains open. However, if you close the tab and then re-open the data set, the columns revert to their original order.
- You can resize the data viewer columns by hovering over the right border of the column until your cursor changes shape, then clicking and dragging the border to the required width. To make the width of a specific column fit the content, double-click the right column border, or select Resize to Fit from the column menu. To resize all columns to fit the content, select Resize All Columns to Fit from the menu in the top right corner of the data viewer.
- To resize the data viewer, drag the top edge to the required size.
- To collapse the data viewer, click the top edge.
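The 1000-record display limit, the row numbers and the record count described above can be pictured with a short sketch. This is only an illustration of the behavior, not the product's implementation: the function name `preview`, the constant `PREVIEW_LIMIT` and the wording of the footer text are all assumptions made for this example.

```python
from itertools import islice

# Display cap described in the text; the constant name is hypothetical.
PREVIEW_LIMIT = 1000


def preview(records, total, limit=PREVIEW_LIMIT):
    """Return numbered rows for the first `limit` records plus a footer string.

    `records` is any iterable of records and `total` is the size of the
    full data set. The footer wording is an assumption, not the product's text.
    """
    # Row numbers start at 1, as in the far-left column of the viewer.
    rows = list(enumerate(islice(records, limit), start=1))
    footer = f"{len(rows)} of {total} records"
    return rows, footer


# A data set of 2500 records is cut off after the first 1000.
rows, footer = preview(({"id": i} for i in range(2500)), total=2500)
print(footer)  # 1000 of 2500 records

# A node that produced no records still yields an empty, valid preview.
empty_rows, empty_footer = preview(iter([]), total=0)
print(empty_footer)  # 0 of 0 records
```

Using `islice` means that only the first 1000 records are read, however large the underlying data set is.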
Comparing data sets side by side
You can select a data set to view in a pop-out window, allowing you to compare two or more data sets side by side. For example, you might want to compare the input to a node with the output. Or, you might want to compare the output of a node after an initial run, then reconfigure the node with different property values to generate a second data set for comparison.
- To view a data set in a pop-out window, click the View in pop-out window button on the data viewer tab.
The data set is displayed in a pop-out window which you can move around the screen independently of the browser window. The tab is removed from the data viewer.
- To close the pop-out window and return the data to the data viewer, click the Dock back into main window button in the top right corner.
Note: If you are viewing data in a pop-out window, you can add a Filter or Split node to your data flow provided that the source data remains available. If your source data is no longer available, you will see a warning at the bottom of the pop-out window. Any changes that you make to the source data while the pop-out window is open will not be reflected in the data viewer.
Data quality indicators
When you view a data set in the data viewer, colored bars indicating the quality of the data are displayed for each column.
The color and the length of the bar indicate the quality of the data. A dark green bar that runs the full width of the column indicates the highest quality data, and a short, red bar indicates the lowest quality data.
For Boolean, Integer / Long, Date / DateTime and Float / Double fields, quality is a measure of how many NULL values are found in the data set.
For String / Unicode fields, quality is a measure of how many NULL values, empty strings or values with leading or trailing white space are found in the data set.
When you hover over a data quality bar, a tooltip shows the quality as a percentage value.
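The text does not give the exact formula behind the percentage, so the following is only a sketch under a stated assumption: that quality is the share of values in a field that are not flagged by the rules above. The function names `is_flagged` and `quality_percent` are hypothetical, and treating an empty field as 100% is also an assumption.

```python
def is_flagged(value, is_string_field):
    """Return True if a value counts against data quality.

    NULL (None) is flagged for every field type. For String / Unicode
    fields, empty strings and values with leading or trailing white
    space are flagged as well.
    """
    if value is None:
        return True
    if is_string_field:
        return value == "" or value != value.strip()
    return False


def quality_percent(values, is_string_field):
    """Assumed formula: percentage of values that are not flagged."""
    if not values:
        return 100.0  # assumption: nothing to flag in an empty field
    flagged = sum(is_flagged(v, is_string_field) for v in values)
    return 100.0 * (len(values) - flagged) / len(values)


names = ["Ann", " Bob", "", None, "Eve "]  # String field: only "Ann" is clean
ages = [31, None, 45, 52, 38]              # Integer field: one NULL

print(quality_percent(names, is_string_field=True))   # 20.0
print(quality_percent(ages, is_string_field=False))   # 80.0
```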
You can click the data quality bar to display a histogram of the data values in the field and a number of statistics on the (sample) data shown in the data viewer, such as min, max, average, Std Dev. You can also display this dialog by choosing Statistics from the column menu.
You can change your view of the histogram by:
- Moving the ends of the gray bar below the histogram to zoom in on a section of the data values.
- Dragging the gray bar below the histogram to the left or right to view a different section of the histogram.
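As a rough illustration of what the statistics dialog summarizes, the sketch below computes the named statistics and a simple equal-width histogram for the sample of values shown in the data viewer. It rests on several assumptions: NULL values are skipped, the population standard deviation is used (the text does not say whether the population or sample formula applies), and the binning method and the names `summarize` and `histogram` are invented for this example.

```python
import statistics


def summarize(values):
    """Min, max, average and standard deviation of the non-NULL values."""
    present = [v for v in values if v is not None]
    return {
        "min": min(present),
        "max": max(present),
        "average": statistics.fmean(present),
        # Assumption: population standard deviation.
        "std_dev": statistics.pstdev(present),
    }


def histogram(values, bins=4):
    """Count non-NULL values in `bins` equal-width intervals."""
    present = [v for v in values if v is not None]
    lo, hi = min(present), max(present)
    width = (hi - lo) / bins or 1  # avoid a zero width when all values match
    counts = [0] * bins
    for v in present:
        # The maximum value falls on the upper edge, so clamp it into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


sample = [2, 4, 4, 4, 5, 5, 7, 9, None]
stats = summarize(sample)
print(stats["min"], stats["max"], stats["average"])  # 2 9 5.0
print(histogram(sample))  # [1, 5, 1, 1]
```

Zooming in on a section of the histogram, as described above, would correspond to calling `histogram` on only the values that fall inside the selected range.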
Copying data from the data viewer
You can copy content from the data viewer by pressing Ctrl+C, or by right-clicking and selecting Copy from the menu. You can then paste the selected data into an external text editor by pressing Ctrl+V.
- To copy the content of a single cell, click the cell and press Ctrl+C. Or, to select multiple cells, click and drag over the cells that you want to select.
- To copy the content of an entire row of data, click the row number on the far left of the data viewer, then press Ctrl+C. To copy the content of an entire column, click the column header and press Ctrl+C.
- You can select all data in multiple adjacent columns by holding Shift and clicking the column headers of the first and last columns that you want to select. You can select all data in multiple columns that are not adjacent by holding Ctrl and clicking the column headers of the columns that you want to select. To select all data in the data viewer, press Ctrl+A, right-click and choose Select All, or click the empty cell in the top left corner of the data viewer.