Repartition - Data360_DQ+ - Latest

Data360 DQ+ Help

Product type: Software
Portfolio: Verify
Product family: Data360
Product: Data360 DQ+
Version: Latest
Language: English
Product name: Data360 DQ+
Title: Data360 DQ+ Help
Copyright: 2024
First publish date: 2016
Last updated: 2024-10-09
Published on: 2024-10-09T14:37:51.625264

The Repartition node can be used to control how data is partitioned during Analysis execution.

Partition By Fields

This parameter allows you to choose which fields to use when repartitioning the data set.

Number of Partitions

This parameter allows you to choose how many partitions the data set should be divided into.

Repartitioning example

Suppose you had the following data set.

name    value
A       10
B       11
C       12
D       13
A       14
A       15
C       16
B       17
C       18
B       19

If you select name as the Partition By Field and specify 4 as the Number of Partitions, the Repartition node might produce the following result.

name    value
C       18
C       12
C       16
A       10
A       14
A       15
B       11
B       17
B       19
D       13

In the result data set, records with the same Partition By Field value are placed in the same partition, that is, close to one another in the data set, in no particular order. In this example, the specified Number of Partitions (4) also matches the number of unique values in the name field (A, B, C and D).
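
The grouping behavior described in this example can be sketched in Python. This is only an illustration and assumes hash partitioning: this page does not state which algorithm the Repartition node actually uses, and the repartition function, the CRC32 hash, and the record layout are hypothetical choices made for the sketch.

```python
import zlib
from collections import defaultdict


def repartition(records, partition_by, num_partitions):
    """Hypothetical helper: bucket records by hashing the partition_by fields.

    Assumption: hash partitioning. This is not taken from the product.
    """
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    partitions = defaultdict(list)
    for record in records:
        # Build one key from all Partition By Fields of this record.
        key = "\x1f".join(str(record[field]) for field in partition_by)
        # CRC32 is used because it is stable across runs, unlike Python's hash().
        index = zlib.crc32(key.encode("utf-8")) % num_partitions
        partitions[index].append(record)
    # Return every partition, including any that received no records.
    return [partitions[i] for i in range(num_partitions)]


# The example data set from this page.
records = [
    {"name": "A", "value": 10},
    {"name": "B", "value": 11},
    {"name": "C", "value": 12},
    {"name": "D", "value": 13},
    {"name": "A", "value": 14},
    {"name": "A", "value": 15},
    {"name": "C", "value": 16},
    {"name": "B", "value": 17},
    {"name": "C", "value": 18},
    {"name": "B", "value": 19},
]

partitions = repartition(records, ["name"], 4)
for index, partition in enumerate(partitions):
    print(index, partition)
```

Under this assumption, all records that share a name always land in the same partition. Two distinct name values can still hash to the same partition, so some partitions may hold more than one value and others may be empty, even when the Number of Partitions equals the number of unique values.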