(Draft) Best Practices for the Data Loss Prevention Policy

Some of the Data Loss Prevention policy's built-in identifiers can produce false positives unless they are customized to narrow the scope of the inspection. The following practices are recommended to reduce the number of false positives in these classifications. For more information on customizing an identifier, see Copy and Customize a Data Identifier.

The default threshold for built-in identifiers without tolerance is 1. This means that the policy acts on a file as soon as the identifier matches once within it. Increasing the threshold requires the identifier to match more times before the file is flagged, so a single incidental match no longer triggers the policy. A threshold of 10, for example, monitors or blocks a file only if at least 10 instances of the identifier are found in the file.
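The threshold behavior described above can be sketched in a few lines of Python. This is only an illustration of the counting logic, not the product's implementation: the function name `meets_threshold` and the nine-digit pattern are hypothetical stand-ins for a real built-in identifier.

```python
import re

# Hypothetical stand-in for a built-in identifier: any standalone 9-digit number.
NINE_DIGIT_NUMBER = re.compile(r"\b\d{9}\b")


def meets_threshold(text, pattern, threshold=1):
    """Return True when `pattern` matches at least `threshold` times in `text`."""
    match_count = len(pattern.findall(text))
    return match_count >= threshold


# A file that happens to contain three 9-digit numbers.
sample = "Invoice 123456789, tracking 987654321, reference 555000111."

flagged_at_default = meets_threshold(sample, NINE_DIGIT_NUMBER)           # threshold of 1
flagged_at_ten = meets_threshold(sample, NINE_DIGIT_NUMBER, threshold=10)
```

With the default threshold of 1 the sample is flagged. With a threshold of 10 it is not, because only three matches are present.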

Proximity Terms
Proximity terms reduce false positives by requiring an identifier to match within 10 terms of specified words.
For example, a Canadian bank account identifier searches for a pattern matching Canadian bank account numbers and transit numbers. A document containing several random numbers that match the pattern could produce false positives. However, if proximity terms such as "Canada" and "bank" are added to the customized identifier, the scope of the inspection is reduced: a number counts as a match only when it appears near one of those terms.
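The proximity check can be illustrated with a short Python sketch. It rests on several assumptions that are not confirmed by the product: text is split into simple alphanumeric tokens, a match is accepted when any one proximity term appears within 10 tokens on either side, and the 7-to-12-digit pattern is a hypothetical placeholder for the real Canadian bank account identifier.

```python
import re

# Hypothetical placeholder for the identifier pattern: a token of 7 to 12 digits.
ACCOUNT_LIKE = re.compile(r"\d{7,12}")


def tokenize(text):
    """Split text into alphanumeric tokens (an assumed, simplified tokenizer)."""
    return re.findall(r"[A-Za-z0-9]+", text)


def proximity_matches(text, pattern, proximity_terms, window=10):
    """Return tokens matching `pattern` that have a proximity term within `window` tokens."""
    tokens = tokenize(text)
    terms = {term.lower() for term in proximity_terms}
    hits = []
    for i, token in enumerate(tokens):
        if not pattern.fullmatch(token):
            continue
        # Collect up to `window` tokens before and after the candidate match.
        start = max(0, i - window)
        nearby = tokens[start:i] + tokens[i + 1:i + 1 + window]
        if any(word.lower() in terms for word in nearby):
            hits.append(token)
    return hits


terms = ["Canada", "bank"]
unrelated = "Order 123456789 shipped on Tuesday"
relevant = "Please wire the funds to my bank account 123456789 in Canada"

no_hits = proximity_matches(unrelated, ACCOUNT_LIKE, terms)
hits = proximity_matches(relevant, ACCOUNT_LIKE, terms)
```

In this sketch the number in the unrelated sentence is ignored because neither "Canada" nor "bank" appears nearby, while the same number in the second sentence is reported.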

