The Discovery Center uses patterns to describe the character shape of a data value. Each pattern represents the data as a series of codes. In the Discovery Center, users reference patterns to quickly identify deviations from the norm when analyzing data.
When you add a new repository, one of the parameters you define is the data profiling pattern. There are six profiling patterns available:
- Default
- Rich
- Long
- Greek
- Hebrew
- Turkish
How to Read Patterns
In the default pattern encoding, a means alpha, d means digit, p means punctuation, and so on. Wherever the same character code repeats in sequence (such as aaaa for four alphabetic characters in a row), the code is written once, followed by the number of repetitions. In this case, the "aaaa" pattern is represented as "a4". A code that occurs only once carries no number. For example, for the default pattern, the data value "Jane Smith" is represented as "a4_a5".
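The encoding described above can be sketched in a few lines of Python. This is only an illustration, not the Discovery Center's own implementation: the function names `char_code` and `default_pattern` are hypothetical, only the a, d, p, and _ codes are covered, and treating every other printable ASCII character as punctuation is an assumption.

```python
from itertools import groupby


def char_code(ch: str) -> str:
    """Map one printable ASCII character to a default pattern code."""
    if not (ch.isascii() and ch.isprintable()):
        # The null, unprintable, non-ASCII, and carriage-return codes
        # are deliberately left out of this sketch.
        raise ValueError(f"unsupported character: {ch!r}")
    if ch.isalpha():
        return "a"
    if ch.isdigit():
        return "d"
    if ch == " ":
        return "_"
    # Assumption: any other printable ASCII character is punctuation.
    return "p"


def default_pattern(value: str) -> str:
    """Collapse runs of identical codes into code + count (count omitted for 1)."""
    parts = []
    for code, run in groupby(char_code(ch) for ch in value):
        n = len(list(run))
        parts.append(code if n == 1 else f"{code}{n}")
    return "".join(parts)


print(default_pattern("Jane Smith"))  # a4_a5
```

Grouping consecutive codes with `groupby` keeps the run-length step separate from the character classification, so a different code table could be swapped in without touching the counting logic.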
The following sections describe the pattern codes and give examples for each pattern.
- Default
The default pattern represents data in shorthand notation by counting the number of characters represented by the default pattern code and displaying that count next to the code. This pattern is useful for identifying the shapes of names, addresses, dates, and simple numbers, such as post codes. This pattern style does not distinguish between upper- and lowercase letters.
Note: The codes and examples also apply to the Greek, Hebrew, and Turkish patterns.
- Default pattern code descriptions
| Pattern Code | Represents |
| --- | --- |
| a | Alpha character |
| d | Digit |
| p | Punctuation |
| _ | Space |
| z | Null |
| . | Unprintable |
| ! | Non-ASCII |
| C. | Carriage return |
- Default pattern code examples
| Value | Pattern |
| --- | --- |
| Wendell Crawford | a7_a8 |
| 5.00E+02 | dpd2apd2 |
| $400.00 | pd3pd2 |
| 07/31/2017 | d2pd2pd4 |
| liz_smith@abc.com | a3pa5pa3pa3 |
- Rich
The rich pattern represents data in shorthand notation by counting the number of characters represented by the rich pattern code and displaying that count next to the code. It is useful for identifying the shapes of names, addresses, dates, currency, and simple numeric values that include plus (+) or minus (-) signs. This pattern style distinguishes between upper- and lowercase letters.
- Rich pattern code descriptions
| Pattern Code | Represents |
| --- | --- |
| l | Lowercase alpha |
| u | Uppercase alpha |
| d | Digit |
| p | Punctuation |
| q | Apostrophe (’), double quotes ("), single quote (‘) |
| S | Symbol |
| _ | Space |
| . | Unprintable |
| m | Currency |
| + | Plus sign |
| - | Minus sign or dash |
- Rich pattern code examples
| Value | Pattern |
| --- | --- |
| Wendell Crawford | ul6_ul7 |
| 5.00E+02 | dSd2u+d2 |
| $400.00 | md3Sd2 |
| 07/31/2017 | d2Sd2Sd4 |
| liz_smith@abc.com | l3Sl5pl3Sl3 |
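As a rough illustration of how the rich pattern separates upper- and lowercase letters, the sketch below covers only the l, u, d, _, q, m, + and - codes. The names `rich_code` and `rich_pattern` are hypothetical. Which characters count as punctuation (p) and which as symbols (S) is not spelled out in the tables above, so the sketch rejects those characters rather than guessing. Detecting currency through the Unicode "Sc" category is also an assumption.

```python
import unicodedata
from itertools import groupby


def rich_code(ch: str) -> str:
    """Map one character to a rich pattern code (partial coverage only)."""
    if ch.isascii() and ch.isalpha():
        return "u" if ch.isupper() else "l"
    if ch.isascii() and ch.isdigit():
        return "d"
    if ch == " ":
        return "_"
    if ch in "+-":
        return ch  # plus and minus are their own codes
    if ch in "'\"\u2018\u2019":
        return "q"  # apostrophe, double quotes, single quotes
    if unicodedata.category(ch) == "Sc":
        return "m"  # assumption: Unicode currency symbols map to m
    # The p / S split is not defined here, so it is not guessed.
    raise ValueError(f"character not covered by this sketch: {ch!r}")


def rich_pattern(value: str) -> str:
    """Collapse runs of identical codes into code + count (count omitted for 1)."""
    parts = []
    for code, run in groupby(rich_code(ch) for ch in value):
        n = len(list(run))
        parts.append(code if n == 1 else f"{code}{n}")
    return "".join(parts)


print(rich_pattern("Wendell Crawford"))  # ul6_ul7
```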
- Long
The long pattern uses a long-hand notation that identifies each character as either alphabetic (A) or numeric (N), and displays all other characters exactly as they appear in the value. For example, Jane Smith is represented as AAAA AAAAA, and $440.40 is represented as $NNN.NN.
- Long pattern code descriptions
| Pattern Code | Represents |
| --- | --- |
| A | Alpha |
| N | Digit |
| Explicit | All characters that are not alpha or numeric display as they appear in the value. |
- Long pattern code examples
| Value | Pattern |
| --- | --- |
| Wendell Crawford | AAAAAAA AAAAAAAA |
| 5.00E+02 | N.NNA+NN |
| $400.00 | $NNN.NN |
| 07/31/2017 | NN/NN/NNNN |
| liz_smith@abc.com | AAA_AAAAA@AAA.AAA |
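The long pattern is simple enough to sketch directly. The example below is illustrative only: `long_pattern` is a hypothetical name, limiting alpha and digit to ASCII is an assumption, and the sample dates are made up. The tally at the end shows one way a pattern can help identify a value that deviates from the norm.

```python
from collections import Counter


def long_pattern(value: str) -> str:
    """Replace letters with A and digits with N; keep everything else as is."""
    out = []
    for ch in value:
        if ch.isascii() and ch.isalpha():
            out.append("A")
        elif ch.isascii() and ch.isdigit():
            out.append("N")
        else:
            out.append(ch)  # explicit: shown exactly as in the value
    return "".join(out)


# Made-up sample column with one date in a different format.
dates = ["07/31/2017", "08/01/2017", "2017-08-02", "08/03/2017"]
counts = Counter(long_pattern(v) for v in dates)

for pattern, n in counts.most_common():
    print(pattern, n)
```

In this sample, three values share the pattern NN/NN/NNNN and one has NNNN-NN-NN, so the odd format stands out without reading every value.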
- Greek, Hebrew, Turkish
These patterns use the same codes as the default pattern. See Default.