Two types of rule groups are available in Data360 DQ+:
- Data Quality Rule Groups - a top level folder, used to hold reusable rules.
- Script Rule Groups - a top level folder, used to hold reusable Script rules.
Defining data quality rule groups
A Data Quality Rule Group is a top level folder, used to hold reusable rules. When building your reusable rules, you must first create at least one Data Quality Rule Group.
Complete the following steps to add a Data Quality rule group:
- With your Rule Library open, select the Rules tab. The Rules tab is where you can create the different types of reusable rules, for use in an Analysis.
- Click New > Data Quality Rule Group.
- Complete the fields for the Data Quality Rule Group. The fields are described below.
- Click Accept to save your changes.
Display Name
The name of the rule group.
Description
A description for the rule group.
Preferred Result Field
The name of the field that will hold the result of a reusable rule when the rule is used within an Analysis.
Preferred Error Reason Field
The name of the field that will hold any errors that result from evaluating a reusable rule within an Analysis.
Semantic Type Identification Rule
Choose whether a semantic type should be identified for the fields that match the relevant rules in the rule group.
When the Semantic Type Identification Rule field is ticked, each rule that you define in the rule group has an Include this Check in Semantic Type Identification field, which you can use to choose whether that rule takes part in semantic type identification. This field is enabled and ticked by default when the Semantic Type Identification Rule field is ticked.
To add a Semantic Type Identification Rule, make sure the Semantic Type Identification Rule field is ticked, and then complete the following fields.
Semantic type
The name of the semantic type that can be assigned to a field that meets the criteria in this rule group.
Semantic types beginning with D3S are reserved for Data360 DQ+ system use. This check is case-insensitive, so d3s, d3S, and D3s are also all reserved.
Field data type
The data type of the field. Choose one of the available Data360 DQ+ data types from the drop-down.
Selection threshold percent
The minimum percentage of values in a field that meet the data quality rules in the group for the field to be considered as a match for the semantic type.
Enter a value between 0 and 100.
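As an illustration of how a threshold of this kind is applied, the following Java sketch computes the percentage of values in a field that pass a check and compares it with the threshold. The class and method names are hypothetical and are not part of Data360 DQ+. The sketch also assumes that every value in the field counts toward the total and that a percentage exactly equal to the threshold counts as a match; the product may treat empty or null values differently.

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical helper for illustration only; not a Data360 DQ+ API.
final class SelectionThreshold {

    // Returns the percentage (0 to 100) of values that satisfy the rule.
    // Assumption: a field with no values yields 0 percent.
    static double matchPercent(List<String> values, Predicate<String> rule) {
        if (values.isEmpty()) {
            return 0.0;
        }
        long passing = values.stream().filter(rule).count();
        return 100.0 * passing / values.size();
    }

    // Assumption: the threshold is a minimum, so reaching it exactly is a match.
    static boolean meetsThreshold(List<String> values, Predicate<String> rule,
                                  double thresholdPercent) {
        if (thresholdPercent < 0 || thresholdPercent > 100) {
            throw new IllegalArgumentException("Threshold must be between 0 and 100");
        }
        return matchPercent(values, rule) >= thresholdPercent;
    }

    public static void main(String[] args) {
        // Stand-in rule: a value passes if it is exactly two uppercase letters.
        Predicate<String> twoLetterCode = v -> v.matches("[A-Z]{2}");
        List<String> values = List.of("US", "GB", "FR", "n/a");

        System.out.println(matchPercent(values, twoLetterCode));       // 75.0
        System.out.println(meetsThreshold(values, twoLetterCode, 75)); // true
        System.out.println(meetsThreshold(values, twoLetterCode, 80)); // false
    }
}
```

Three of the four sample values pass, so the match percentage is 75. Under the assumptions above, the field qualifies at a threshold of 75 but not at 80.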
Field name check
A regular expression that is applied to the field name. The field name must match the regular expression for the field to be considered as a match for the semantic type.
To define the regular expression:
- Click the Edit button. The Edit and Test Regular Expression dialog opens.
- Enter a regular expression in the Regular Expression field. You can test the regular expression by entering a test string and clicking Test.
To test for an exact match on a field name, you can type the name of the field as it appears in the data set. For example, if you want to consider only fields with a name of "Region", you can enter "Region" in the Regular Expression field. Note that the regular expression is case sensitive, so "region" would not be a match. To test for "region" or "Region", you would need to define the regular expression in a way that tests for both cases.
For more information about regular expression formatting, see Java Regex Help.
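The case-sensitivity behavior described above can be tried outside the product with the standard java.util.regex classes. This is a minimal sketch: the class and method names are hypothetical, and it assumes that the whole field name must match the expression (as Matcher.matches() does), which may not be exactly how the Edit and Test Regular Expression dialog evaluates it.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not a Data360 DQ+ API.
final class FieldNameCheck {

    // Assumption: the entire field name must match the expression.
    static boolean nameMatches(String regex, String fieldName) {
        return Pattern.compile(regex).matcher(fieldName).matches();
    }

    public static void main(String[] args) {
        // A literal expression matches only the exact, case-sensitive name.
        System.out.println(nameMatches("Region", "Region")); // true
        System.out.println(nameMatches("Region", "region")); // false

        // A character class accepts either case for the first letter.
        System.out.println(nameMatches("[Rr]egion", "region")); // true

        // The inline flag (?i) makes the whole expression case-insensitive.
        System.out.println(nameMatches("(?i)region", "REGION")); // true
    }
}
```

The expression [Rr]egion accepts only "Region" and "region", whereas (?i)region also accepts variants such as "REGION". Choose whichever suits the field names in your data sets.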
Placeholder Fields
Placeholder Fields enable the "reusability" of a Rule Library by allowing you to define what types of fields the reusable rules contained within the Data Quality Rule Group can accept. Placeholder Fields can be given virtually any name; however, their data types should match the data types of the fields you intend to evaluate with your rules.
Defining Script Rule Groups
A Script Rule Group is a top level folder, used to hold reusable Script rules. When building your reusable rules, you must first create at least one Script Rule Group.
Complete the following steps to add a Script rule group:
- With your Rule Library open, select the Rules tab. The Rules tab is where you can create the different types of reusable rules, for use in an Analysis.
- Click New > Script Rule Group.
- Complete the fields for the Script Rule Group. The fields are described below.
- Click Accept to save your changes.
Display Name
The name of the script rule group.
Description
A description for the script rule group.
Copying rule groups
You can make a copy of a rule group in a rule library, or copy a rule group from one rule library to another.
Complete the following steps to copy a rule group.
- Navigate to the Rule Library that you want to copy a rule group into.
- Click Edit > Edit Stage.
- Make sure a Rule Group is selected. You must already have at least one rule group in the rule library before you can copy an existing rule group into the rule library.
- Click Copy From, and select whether to copy a rule group from This Rule Library or Other Rule Library.
- If you selected Other Rule Library, find and select the Rule Library in the Copy From Other Rule Library dialog.
- Select the Rule Group that you want to copy.
- Leave the Rule selection box empty.
- Click Copy.
When a rule group is copied into a rule library, its name is preserved. If a rule group with the same name already exists, the copied rule group has _Copy1 appended to its display name.