Rule groups - Data360_DQ+ - Latest

Data360 DQ+ Help

Product type: Software
Portfolio: Verify
Product family: Data360
Product: Data360 DQ+
Version: Latest
Language: English
Product name: Data360 DQ+
Title: Data360 DQ+ Help
Copyright: 2024
First publish date: 2016
ft:lastEdition: 2024-07-09
ft:lastPublication: 2024-07-09T15:09:58.774265

Two types of rule groups are available in Data360 DQ+:

  • Data Quality Rule Groups - a top level folder, used to hold reusable rules.
  • Script Rule Groups - a top level folder, used to hold reusable Script rules.

Defining data quality rule groups

A Data Quality Rule Group is a top level folder, used to hold reusable rules. When building your reusable rules, you must first create at least one Data Quality Rule Group.

Complete the following steps to add a Data Quality rule group:

  1. With your Rule Library open, select the Rules tab. The Rules tab is where you can create the different types of reusable rules, for use in an Analysis.
  2. Click New > Data Quality Rule Group.
  3. Complete the fields for the Data Quality Rule Group. The fields are described below.
  4. Click Accept to save your changes.

Tip: Organize Data Quality Rule Groups around the semantic or logical type of the data being processed, not around the Data Quality Rule Type used within the group.

Display Name

The name of the rule group.

Description

A description for the rule group.

Preferred Result Field

The Preferred Result Field sets the name of the field that will hold the result of a reusable rule within an Analysis.

Preferred Error Reason Field

The Preferred Error Reason Field sets the name of the field that will hold errors resulting from the evaluation of a reusable rule within an Analysis.

Semantic Type Identification Rule

Choose whether a semantic type should be identified for the fields that match the relevant rules in the rule group.

When the Semantic Type Identification Rule field is ticked, the Include this Check in Semantic Type Identification field lets you choose, for each rule that you define in the rule group, whether that rule is included in the semantic type identification. This field is enabled and ticked by default when the Semantic Type Identification Rule field is ticked.

To add a Semantic Type Identification Rule, make sure the Semantic Type Identification Rule field is ticked, then complete the following fields.

Semantic type

The name of the semantic type that can be assigned to a field that meets the criteria in this rule group.

Note: The value you provide for Semantic Type cannot begin with D3S. Semantic Types beginning with D3S are reserved for Data360 DQ+ system use. This check is case-insensitive, so d3s, d3S, and D3s are also all reserved.
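
The reserved-prefix check described in the note can be sketched as follows. This is only an illustration of the case-insensitive rule, not the product's own validation code; the function name is hypothetical.

```python
RESERVED_PREFIX = "D3S"  # reserved for Data360 DQ+ system use, per the note above


def is_reserved_semantic_type(name: str) -> bool:
    """Return True if the name begins with the reserved prefix, ignoring case.

    Hypothetical helper for illustration; not part of Data360 DQ+.
    """
    return name.upper().startswith(RESERVED_PREFIX)


for candidate in ["D3S_Email", "d3s_email", "D3sPhone", "Email"]:
    status = "reserved" if is_reserved_semantic_type(candidate) else "allowed"
    print(f"{candidate}: {status}")
```

Checking a proposed name this way before entering it avoids a rejected Semantic Type value.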

Field data type

The data type of the field. Choose one of the available Data360 DQ+ data types from the drop-down.

Selection threshold percent

The minimum percentage of values in a field that must meet the data quality rules in the group for the field to be considered a match for the semantic type.

Enter a value between 0 and 100.
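
To make the threshold concrete, the following sketch computes the percentage of values that pass and compares it with the threshold. It is an illustration only: the function name, the `passes_rules` callback, and the treatment of an empty field (never a match) are assumptions, and the way Data360 DQ+ counts null or missing values is not described here.

```python
from typing import Any, Callable, Iterable


def meets_selection_threshold(
    values: Iterable[Any],
    passes_rules: Callable[[Any], bool],
    threshold_percent: float,
) -> bool:
    """Return True if at least threshold_percent of the values pass the rules.

    Hypothetical helper for illustration; not part of Data360 DQ+.
    """
    if not 0 <= threshold_percent <= 100:
        raise ValueError("threshold_percent must be between 0 and 100")
    values = list(values)
    if not values:
        return False  # assumption: an empty field is never a match
    passing = sum(1 for value in values if passes_rules(value))
    # Cross-multiplied comparison avoids floating-point division.
    return passing * 100 >= threshold_percent * len(values)


# Three of the four sample values contain "@", so 75% of the values pass.
sample = ["a@example.com", "b@example.org", "not-an-email", "c@example.net"]
print(meets_selection_threshold(sample, lambda v: "@" in v, 75))
print(meets_selection_threshold(sample, lambda v: "@" in v, 80))
```

With a threshold of 75 the sample field qualifies; with a threshold of 80 it does not.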

Field name check

A regular expression that is applied to the field name. The field name must match the regular expression for the field to be considered as a match for the semantic type.

To define the regular expression:

  1. Click the Edit button. The Edit and Test Regular Expression dialog opens.
  2. Enter a regular expression in the Regular Expression field. You can test the regular expression by entering a test string and clicking Test.

To test for an exact match on a field name, type the name of the field as it appears in the data set. For example, to consider only fields named "Region", enter "Region" in the Regular Expression field. The regular expression is case-sensitive, so "region" would not be a match. To match both "region" and "Region", define the regular expression so that it accepts both cases, for example "[Rr]egion".

For more information about regular expression formatting, see Java Regex Help.
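
The case-sensitivity behavior described above can be tried outside the product with a short script. The product uses Java regular expression syntax; this sketch uses Python's `re` module instead, which treats the simple constructs shown here (literal text, a character class, and the inline `(?i)` flag) the same way. It assumes the whole field name must match the pattern, and the helper name is hypothetical.

```python
import re


def field_name_matches(pattern: str, field_name: str) -> bool:
    """Return True if the entire field name matches the pattern.

    Assumption: whole-name matching. Hypothetical helper for illustration.
    """
    return re.fullmatch(pattern, field_name) is not None


print(field_name_matches("Region", "Region"))      # exact, case-sensitive match
print(field_name_matches("Region", "region"))      # differs in case
print(field_name_matches("[Rr]egion", "region"))   # character class accepts both cases
print(field_name_matches("(?i)region", "REGION"))  # inline flag ignores case entirely
```

A character class such as `[Rr]` accepts only the listed variants, while the `(?i)` flag accepts any mix of upper and lower case.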

Placeholder Fields

Placeholder Fields make a Rule Library reusable by defining the types of fields that the reusable rules in the Data Quality Rule Group can accept. You can give a Placeholder Field virtually any name, but its data type should match the data type of the field that you intend to evaluate with your rules.
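
The matching requirement can be pictured as a check of each placeholder's data type against the data set field it stands in for. The sketch below is purely illustrative: the function, the field names, and the type labels ("string", "integer") are hypothetical and do not reflect how Data360 DQ+ stores placeholders or names its data types.

```python
from typing import Dict, List, Tuple


def find_type_mismatches(
    placeholders: Dict[str, str],
    field_types: Dict[str, str],
    mapping: Dict[str, str],
) -> List[Tuple[str, str, str, str]]:
    """List (placeholder, field, expected type, actual type) for each mismatch.

    Hypothetical helper for illustration; not part of Data360 DQ+.
    """
    mismatches = []
    for placeholder, field in mapping.items():
        expected = placeholders[placeholder]
        actual = field_types[field]
        if expected != actual:
            mismatches.append((placeholder, field, expected, actual))
    return mismatches


# Hypothetical placeholders, data set fields, and the mapping between them.
placeholders = {"EmailValue": "string", "CustomerAge": "integer"}
field_types = {"contact_email": "string", "age_text": "string"}
mapping = {"EmailValue": "contact_email", "CustomerAge": "age_text"}

print(find_type_mismatches(placeholders, field_types, mapping))
```

Here the second placeholder is reported because it expects an integer while the mapped field holds strings.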

Defining Script Rule Groups

A Script Rule Group is a top level folder, used to hold reusable Script rules. When building your reusable Script rules, you must first create at least one Script Rule Group.

Complete the following steps to add a Script rule group:

  1. With your Rule Library open, select the Rules tab. The Rules tab is where you can create the different types of reusable rules, for use in an Analysis.
  2. Click New > Script Rule Group.
  3. Complete the fields for the Script Rule Group. The fields are described below.
  4. Click Accept to save your changes.

Display Name

The name of the script rule group.

Description

A description for the script rule group.

Copying rule groups

You can make a copy of a rule group in a rule library, or copy a rule group from one rule library to another.

Complete the following steps to copy a rule group:

  1. Navigate to the Rule Library that you want to copy a rule group into.
  2. Click Edit > Edit Stage.
  3. Make sure a Rule Group is selected. You must already have at least one rule group in the rule library before you can copy an existing rule group into the rule library.
  4. Click Copy From, and select whether to copy a rule group from This Rule Library or Other Rule Library.
  5. If you selected Other Rule Library, find and select the Rule Library in the Copy From Other Rule Library dialog.
  6. Select the Rule Group that you want to copy.
  7. Leave the Rule selection box empty.
  8. Click Copy.

When a rule group is copied, its display name is preserved. If a rule group with the same name already exists in the target rule library, the copied rule group has _Copy1 appended to its display name.
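
The naming behavior can be summarized in a few lines. This sketch covers only the case stated above, a single clash that results in the _Copy1 suffix; what the product does when a name ending in _Copy1 is also already taken is not described here, so the sketch does not model it. The function name is hypothetical.

```python
from typing import Set


def copied_display_name(name: str, existing_names: Set[str]) -> str:
    """Return the display name a copied rule group would receive.

    Hypothetical helper for illustration; models only the documented
    single-clash case.
    """
    if name in existing_names:
        return f"{name}_Copy1"
    return name


existing = {"Addresses", "Emails"}
print(copied_display_name("Phones", existing))
print(copied_display_name("Emails", existing))
```

The first call keeps the name Phones because nothing clashes; the second returns Emails_Copy1.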