Command Metacharacters - spectrum_quality_1 - 23.1

Spectrum Data Quality Guide

Product type
Software
Portfolio
Verify
Product family
Spectrum
Product
Spectrum > Quality > Spectrum Quality
Version
23.1
Language
English
Product name
Spectrum Data Quality
Topic type
Overview
Reference
Tips
How Do I
First publish date
2007
ft:lastEdition
2024-03-04
ft:lastPublication
2024-03-04T22:52:13.486265

Open Parser supports the standard set of Java RegEx character class metacharacters in the %Tokenize and @RegEx commands. A metacharacter is a character that carries special meaning in pattern matching. The supported metacharacters are:

([{\^-$|]})?*+.

There are two ways to force a metacharacter to be treated as an ordinary character:

  • Precede the metacharacter with a backslash (\).
  • Enclose it within \Q (which starts the quote) and \E (which ends it).
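Both escaping techniques can be illustrated with plain Java regular expressions. The following is a minimal sketch using the standard java.util.regex package, not Open Parser syntax; the class name EscapeDemo and its method names are hypothetical and exist only for this illustration. Note that in Java source code each regex backslash must itself be doubled inside a string literal.

```java
import java.util.regex.Pattern;

// Hypothetical demo class: shows standard Java regex escaping only,
// not the Open Parser %Tokenize or @RegEx command syntax.
public class EscapeDemo {

    // Unescaped: '.' is a metacharacter and matches any single character.
    static boolean unescaped(String input) {
        return Pattern.matches("a.c", input);
    }

    // Backslash escape: the Java literal "a\\.c" is the regex a\.c,
    // so '.' is treated as an ordinary dot.
    static boolean backslashEscaped(String input) {
        return Pattern.matches("a\\.c", input);
    }

    // \Q...\E quoting: every character between the markers is literal,
    // so '+' and '?' lose their special meaning.
    static boolean quoted(String input) {
        return Pattern.matches("\\Q1+1=2?\\E", input);
    }

    public static void main(String[] args) {
        System.out.println(unescaped("abc"));        // true
        System.out.println(backslashEscaped("abc")); // false
        System.out.println(backslashEscaped("a.c")); // true
        System.out.println(quoted("1+1=2?"));        // true
        System.out.println(quoted("11=2"));          // false
    }
}
```

Without the \Q and \E markers, the pattern 1+1=2? would match "11=2", because '+' and '?' would act as quantifiers.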

%Tokenize follows the rules for Java regular expression character classes, not the rules for Java regular expressions as a whole.

In general, the reserved characters for a character set are:

  • '[' and ']' open and close a nested set.
  • '-' is a metacharacter (the range operator) if it is between two other characters.
  • '^' is a metacharacter (negation) if it is the first character in a set.
  • '&&' is a metacharacter (the intersection operator) if it is between two other characters.
  • '\' means that the next character is a literal.
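The character set rules above can be checked against standard Java character classes. This is a sketch using java.util.regex directly, on the assumption that %Tokenize treats sets the same way; the class name CharClassDemo and the helper m are hypothetical names introduced for this illustration.

```java
import java.util.regex.Pattern;

// Hypothetical demo class: exercises standard Java character class rules.
public class CharClassDemo {

    // Returns true if the whole input matches the regex.
    static boolean m(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // '-' between two characters defines a range.
        System.out.println(m("[a-c]", "b"));            // true
        // An escaped '-' is a literal hyphen, so the set is 'a', '-', 'c'.
        System.out.println(m("[a\\-c]", "b"));          // false
        System.out.println(m("[a\\-c]", "-"));          // true
        // '^' as the first character negates the set.
        System.out.println(m("[^a]", "b"));             // true
        System.out.println(m("[^a]", "a"));             // false
        // '^' anywhere else in the set is a literal caret.
        System.out.println(m("[a^]", "^"));             // true
        // '&&' intersects two sets: lowercase letters that are not vowels.
        System.out.println(m("[a-z&&[^aeiou]]", "b"));  // true
        System.out.println(m("[a-z&&[^aeiou]]", "e"));  // false
        // '[' and ']' inside a set introduce a nested set (a union).
        System.out.println(m("[a-c[x-z]]", "y"));       // true
    }
}
```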

If you are unsure whether a character will be treated as a metacharacter and you want it treated as a literal, escape that character with a backslash.