Custom Regular Expression Patterns

You must create your own regex patterns. Creating custom regex is outside the scope of Umbrella Support.

Custom regular expression patterns for DLP data classifications support basic Java regular expression syntax, with some limitations.

Limitations

General

  • A regex can have a maximum of 1,000 characters.
  • A custom identifier can have a maximum of 10 regex patterns.
  • A custom identifier can have a maximum of 100 entries.
  • The minimum length of a regex pattern is 3 characters.
  • The maximum number of matches for a regex is 1,000.
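
The length and count limits above can be checked before a pattern is submitted. The following is a minimal sketch, not part of Umbrella: the class name GeneralLimitsCheck and its check method are hypothetical, and it assumes that pattern length is counted in Java characters. It does not cover the limits on entries or matches.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; the names are illustrative and not part of Umbrella.
class GeneralLimitsCheck {
    static final int MAX_PATTERN_CHARS = 1000;          // maximum characters per regex
    static final int MIN_PATTERN_CHARS = 3;             // minimum pattern length
    static final int MAX_PATTERNS_PER_IDENTIFIER = 10;  // maximum regex patterns per identifier

    // Returns a list of problems; an empty list means the patterns
    // pass these length and count checks.
    static List<String> check(List<String> patterns) {
        List<String> problems = new ArrayList<>();
        if (patterns.size() > MAX_PATTERNS_PER_IDENTIFIER) {
            problems.add("too many patterns: " + patterns.size());
        }
        for (String p : patterns) {
            // Assumption: length is measured in Java chars (UTF-16 code units).
            if (p.length() < MIN_PATTERN_CHARS) {
                problems.add("pattern too short: " + p);
            } else if (p.length() > MAX_PATTERN_CHARS) {
                problems.add("pattern too long (" + p.length() + " characters)");
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        // The second pattern has only two characters, so it is reported.
        System.out.println(check(Arrays.asList("\\d{3}\\u002D\\d{4}", "ab")));
    }
}
```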

Regex Syntax

  • Anchor Flags
    • ^ cannot be used to anchor the start of the input.
    • $ cannot be used to anchor the end of the input.
  • Back References
    • \n (for example, \1) cannot be used to reference a previous capture group n.

The following must be used for these character definitions:

  • Whitespace—'\u0020'
  • Dash—'\u002D'
  • Single quote—(char)0x0027
  • Double quote—'\u0022'
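
As an illustration of the character definitions above, the sketch below tests a pattern locally with java.util.regex, which accepts \uXXXX escapes inside a pattern. The class name EscapedCharactersDemo and the sample format (two capital letters, a space, four digits, a dash, four digits) are hypothetical. It also assumes that Umbrella interprets these escapes the same way java.util.regex does.

```java
import java.util.regex.Pattern;

// Hypothetical demo; the names and the sample format are illustrative only.
class EscapedCharactersDemo {
    // The doubled backslash keeps the \uXXXX escape in the regex text, so the
    // regex engine (not the Java compiler) interprets it.
    // The pattern text itself is: [A-Z]{2}\u0020\d{4}\u002D\d{4}
    static final String PATTERN_TEXT = "[A-Z]{2}\\u0020\\d{4}\\u002D\\d{4}";

    // True when the whole input has the sample format.
    static boolean matches(String input) {
        return Pattern.compile(PATTERN_TEXT).matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("AB 1234-5678"));  // space and dash present
        System.out.println(matches("AB_1234-5678"));  // underscore instead of space
    }
}
```

The first call matches and the second does not, because \u0020 only accepts a space.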

Regex Breadth

  • Wildcards
    • .* and .+ are not accepted.
    • ()* and ()+ are not accepted.
  • Alternations (|) are limited to a maximum of 20 per pattern.
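
The breadth limits can be approximated with a simple scan of the pattern text. This is a rough sketch under stated assumptions: the class name BreadthLimitsCheck is hypothetical, "()*" and "()+" are read as any group followed by * or +, and the scan skips backslash-escaped characters but does not understand character classes, so a pattern such as [.*] is flagged as well.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical pre-check; a heuristic, not Umbrella's actual validation.
class BreadthLimitsCheck {
    static final int MAX_ALTERNATIONS = 20;

    // Returns a list of problems; an empty list means no breadth issue was found.
    static List<String> check(String regex) {
        List<String> problems = new ArrayList<>();
        int alternations = 0;
        boolean unboundedDot = false;
        boolean unboundedGroup = false;
        for (int i = 0; i < regex.length(); i++) {
            char c = regex.charAt(i);
            if (c == '\\') {
                i++;  // skip the escaped character
                continue;
            }
            char next = i + 1 < regex.length() ? regex.charAt(i + 1) : '\0';
            boolean unbounded = next == '*' || next == '+';
            if (c == '|') alternations++;
            if (c == '.' && unbounded) unboundedDot = true;    // .* or .+
            if (c == ')' && unbounded) unboundedGroup = true;  // (...)* or (...)+
        }
        if (unboundedDot) problems.add("contains .* or .+");
        if (unboundedGroup) problems.add("contains a group followed by * or +");
        if (alternations > MAX_ALTERNATIONS) {
            problems.add("too many alternations: " + alternations);
        }
        return problems;
    }

    public static void main(String[] args) {
        System.out.println(check("\\d{3}|\\d{4}"));  // bounded quantifiers, one alternation
        System.out.println(check("id.*"));           // uses .*
    }
}
```

Bounded quantifiers such as {3} or {1,10} pass this check, so they are one way to replace * and + in a pattern.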

Word Boundary

The regular expression is wrapped in a word boundary to ensure it matches the entire word. Words are composed of alphanumeric characters (a-z, A-Z, 0-9), and a boundary is a position that does not have a word character on both sides.
Example:
When the regular expression is cat, the sentence "The cat chased the mouse" will match, but "The catchased the mouse" will not match.
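
The cat example can be reproduced locally with the sketch below. It models the wrapping with lookarounds that reject an alphanumeric character directly before or after the match. This is an assumption about how the boundary behaves for patterns that start and end with alphanumeric characters, not a description of Umbrella's implementation, and the class name WordBoundaryDemo is hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical demo; models the word boundary wrapping with lookarounds.
class WordBoundaryDemo {
    // Assumption: a match may not have a-z, A-Z, or 0-9 directly
    // before or after it.
    static Pattern wrap(String regex) {
        return Pattern.compile("(?<![A-Za-z0-9])(?:" + regex + ")(?![A-Za-z0-9])");
    }

    // True when the wrapped pattern occurs anywhere in the text.
    static boolean found(String regex, String text) {
        return wrap(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(found("cat", "The cat chased the mouse"));  // true
        System.out.println(found("cat", "The catchased the mouse"));   // false
    }
}
```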

