Reshape - Data360_DQ+ - Latest

Data360 DQ+ Help

Product type
Software
Portfolio
Verify
Product family
Data360
Product
Data360 DQ+
Version
Latest
Language
English
Product name
Data360 DQ+
Title
Data360 DQ+ Help
Copyright
2024
First publish date
2016
ft:lastEdition
2024-07-09
ft:lastPublication
2024-07-09T15:09:58.774265

The Reshape node allows you to convert rows into columns or columns into rows.

Converting Rows into Columns

At times, you may find it useful to convert a field's records into fields themselves. This option allows you to do so, by selecting Fields to Group By and Fields to Move to Columns. Once Fields to Group By have been selected, Fields that have been moved to Columns that are part of the same group will appear in the same row.

For example, consider performing a Reshape on the following sample from a data set:

id

loan_amnt

term

int_rate

001

19200

60 months

13.99

002

7200

36 months

12.29

003

8000

36 months

14.65

004

8175

36 months

16.99

005

8000

36 months

13.99

Sample of Loan Data Set Before Reshape

Selecting term as the Field to Group By, id as the Field to Move to Columns, and 5 as the Number of Rows to Move to Columns would reshape the data set into the following:

loan_amnt

term

int_rate

id

id_2

id_3

id_4

id_5

7200

36 months

12.29

002

003

004

005

009

19200

60 months

13.99

001

014

101

089

045

Sample of Loan Data Set After Reshape - Grouped By a Single Field

In this reshaped data set, the first 5 ids that have a 36 month term have been turned into columns and the first 5 ids that have a 60 month term have been turned into columns. This is highlighted by the values in bold. Additionally, ids that were not listed in the Sample of Loan Data Set Before Reshape were found; for the sake of space, this example assumes that the data set is much larger than what is listed.

It is also important to note that the loan_amnt and int_rate values listed in each record are only applicable to the first id, because we have not grouped by these fields. To obtain records where loan_amnt, term, and int_rate all apply to the ids listed in each record, we would need to Group By multiple fields, which the Reshape node does allow.

Doing so could result in the following data set, where each id matches the loan_amnt, term, and int_rate in its record.

Sample of Loan Data Set After Reshape - Grouped By Multiple Fields

loan_amnt

term

int_rate

id

id_2

id_3

id_4

id_5

19200

60 months

13.99

001

7200

36 months

12.29

002

8000

36 months

14.65

003

8175

36 months

16.99

004

8000

36 months

13.99

005

149

Converting columns into rows

At other times, you may wish to convert columns into rows, which the Reshape node also allows.

For example, you may have a data set where records may have either an id or member_id, and you want to merge these columns into a single field comprised of rows from each field.

id

member_id

001

143

009

Example 1: Data Set Before Column to Row Reshape

Specifying id and member_id as Fields to Move to Rows and Name of New Field to Move Columns to as common_id would reshape the data set into the following:

common_id

001

143

009

Example 1: Data Set After Column to Row Reshape

Note that record ordering will following the order in which Fields to Move to Rows are listed in the Properties panel. Also note that blank/null records have been carried over into the common_id column; however, these could easily be removed by using a Filter node.

An example using multiple columns can further illustrate how converting columns to rows will reshape your data set. Consider the following sample.

state

Q1

Q2

Q3

Q4

Indiana

25

50

75

100

Maryland

50

100

150

200

California

500

1000

1500

2000

Wisconsin

100

200

300

400

Virginia

50

100

150

200

Illinois

1000

2000

4000

5000

Example 2: Data Set Before Column to Row Reshape

Specifying Q1, Q2, Q3, and Q4 as Fields to Move to Rows, creating a field named Amount for the Name of New Field to Move Columns to parameter, and creating a field named Quarter for the Name of New Field to Move Column Labels to parameter would result in the following data set.

state

Amount

Quarter

Indiana

25

Q1

Indiana

50

Q2

Indiana

75

Q3

Indiana

100

Q4

Maryland

50

Q1

Maryland

100

Q2

Example 2: Sample of Data Set After Column to Row Reshape

Note that the four quarter columns have been consolidated into a single column, and that the labels of these previous four columns are now used as the values for the column's records. As a result, each state now has four records - one for each quarter. Similarly, the values that filled the quarter columns before the reshape now populate the Amount column, each value corresponding to the quarter and state it was associated with before the reshape.

Also note that for brevity, only six records are shown in the table above. After this reshape, the full data set would actually contain 4 records for each state (one record for each quarter column), for a total of 24 records.