Predefined Policy Configuration

Optical Character Scan

Enable optical character scanning in the Content tab of a predefined policy to detect text in images. Select the optical character scanning checkbox to ensure that text is extracted from images and scanned for violations based on the configured policy.


Exceptions to the Content Regular Expression

Predefined policies have prebuilt regular expressions; however, you can add exceptions to the regular expression. For example, if a credit card policy is monitoring an environment but specific credit cards do not need to be monitored, those credit card numbers can be added to the regular expression exception.
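As a rough illustration of how an exception list narrows a prebuilt pattern, the sketch below drops matches that appear in a set of excluded values. The pattern, the helper names `digits_only` and `find_violations`, and the sample numbers are hypothetical; this is not how the product itself is implemented.

```python
import re

# Hypothetical, simplified pattern for 16-digit card numbers. The actual
# predefined expressions are not documented here.
CARD_PATTERN = re.compile(r"\b(?:\d{4}[- ]?){3}\d{4}\b")


def digits_only(value):
    """Strip separators so differently formatted numbers compare equal."""
    return re.sub(r"\D", "", value)


def find_violations(text, exceptions=()):
    """Return pattern matches that are not listed as exceptions."""
    excluded = {digits_only(e) for e in exceptions}
    return [
        m.group()
        for m in CARD_PATTERN.finditer(text)
        if digits_only(m.group()) not in excluded
    ]
```

With the exception `"4111111111111111"` configured, a document containing both `4111 1111 1111 1111` and `5500-0000-0000-0004` would report only the second number.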



Threshold

The threshold is the number of occurrences (1-1000) of a search pattern that must be detected in a single document for an incident to be generated. For example, a credit card policy with a threshold of 10 generates an incident only for documents containing 10 or more credit card numbers. The default threshold is 1 match.
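The threshold rule can be sketched as a simple count of matches per document. The function name `exceeds_threshold` and the sample pattern are assumptions made for illustration only.

```python
import re


def exceeds_threshold(text, pattern, threshold=1):
    """Return True when `pattern` occurs at least `threshold` times in one document."""
    if not 1 <= threshold <= 1000:
        raise ValueError("threshold must be between 1 and 1000")
    # Count non-overlapping occurrences of the pattern in this document.
    count = sum(1 for _ in re.finditer(pattern, text))
    return count >= threshold
```

A document with two matches would generate an incident at the default threshold of 1 and at a threshold of 2, but not at a threshold of 3.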


Tolerance (some are limited to the default)

Tolerance in a policy defines how precise a match the regular expression looks for. There are three levels of Tolerance: Lenient, Moderate, and Strict.

  • Lenient Tolerance—produces more general matches and is most likely to produce "false positives". The policy will most likely search for any instance of the regular expression in any document.
  • Moderate Tolerance—narrows the matches found by performing some additional numeric and formatting validation. The policy might search for all instances of the match in any document but exclude specific formats of the match or require keywords within a certain proximity of the match.
  • Strict Tolerance—is even more restrictive than Moderate Tolerance by also testing for additional patterns in close proximity to the detected content.
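One way to picture the three levels is as successive filters on the same raw matches. The sketch below is an assumption-laden illustration: it uses a simplified 16-digit pattern, treats the Luhn checksum as the "numeric validation" for Moderate, and treats a nearby keyword as the "additional pattern" for Strict. The names `CARD`, `KEYWORDS`, `luhn_ok`, and `matches`, and the 40-character window, are all hypothetical.

```python
import re

CARD = re.compile(r"\b\d{16}\b")  # hypothetical simplified pattern
KEYWORDS = re.compile(r"visa|mastercard|card", re.IGNORECASE)  # assumed keywords


def luhn_ok(digits):
    """Luhn checksum, the check-digit scheme used by payment card numbers."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def matches(text, tolerance="lenient", window=40):
    """Return card-like matches that pass the checks for the given tolerance."""
    hits = []
    for m in CARD.finditer(text):
        # Moderate and Strict: add numeric validation.
        if tolerance in ("moderate", "strict") and not luhn_ok(m.group()):
            continue
        # Strict: additionally require a keyword close to the match.
        if tolerance == "strict":
            nearby = text[max(0, m.start() - window): m.end() + window]
            if not KEYWORDS.search(nearby):
                continue
        hits.append(m.group())
    return hits
```

Each level returns a subset of the previous one, which matches the idea that stricter tolerance trades recall for fewer false positives.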


Tolerance with Spreadsheets

In a spreadsheet, the pattern must be in the same row or column, whereas in a document the pattern must be on the same line.
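Read as a rule about where a supporting pattern may sit relative to a match, the spreadsheet case could be sketched as below. This is one interpretation of the sentence above; cells are modeled as hypothetical `(row, column)` index pairs, and both function names are illustrative.

```python
def same_row_or_column(cell_a, cell_b):
    """Cells are (row, column) index pairs."""
    return cell_a[0] == cell_b[0] or cell_a[1] == cell_b[1]


def supported_matches(match_cells, context_cells):
    """Keep matches that share a row or column with at least one context cell."""
    return [
        m for m in match_cells
        if any(same_row_or_column(m, c) for c in context_cells)
    ]
```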


Proximity expressions can be used to narrow down the content that is monitored and exposed. Many different types of documents might contain the words "financial", "bank", or "account" separately in different areas of the document, so you might want to narrow the results to documents that are more likely to be sensitive, such as financial records that contain these words in proximity to other words, phrases, or a sequence of numbers. For example, a bank account number, Social Security number, or birth date might each be a useful proximity expression for a financial regex.
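A minimal sketch of a proximity check follows, assuming proximity is measured in characters between the two matches. The function name `near`, the `max_chars` parameter, and the 50-character default are hypothetical; the product's actual distance unit is not stated here.

```python
import re


def near(text, pattern, proximity_pattern, max_chars=50):
    """Return matches of `pattern` that have a `proximity_pattern` hit within max_chars."""
    prox_spans = [
        m.span() for m in re.finditer(proximity_pattern, text, re.IGNORECASE)
    ]
    results = []
    for m in re.finditer(pattern, text):
        for start, end in prox_spans:
            # Distance between the two spans; 0 when they touch or overlap.
            gap = max(start - m.end(), m.start() - end, 0)
            if gap <= max_chars:
                results.append(m.group())
                break
    return results
```

Here an eight-digit number next to the word "account" would be reported, while the same kind of number far from any financial keyword would not.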


Platform(s) to Monitor

The platforms in your environment are listed so you can choose to monitor all platforms or only certain platforms. If, for example, you were creating a policy to monitor only documents stored in O365 but not Google, you would select only O365.

File Type

The product can scan files within an environment by both name and content, or by name only. Music and video files, for example, are scanned only by name because they do not contain text to be scanned. Standalone images in PNG, JPEG, GIF, and TIFF formats are scanned: the text is extracted and analyzed for content violations. In many cases, it is beneficial to scan all file types to be sure that all violations are found; however, if there are environmental limitations on which file types are available to users, you can narrow the policy down by file type.



Attachments—(Salesforce and ServiceNow) When a file is attached to a field or other object, it is uploaded and stored in the platform and at that point becomes subject to monitoring by relevant policies.
Spreadsheets—Cloudlock examines the first 1,000 rows and 50 columns, and a maximum of 10,000 total cells in a single spreadsheet document. Blank cells are still counted as data and the value is "null".
PDFs—Cloudlock supports scanning PDFs for content and context only when they are digitally created. PDFs produced by a scanner, which creates an "image" of the document, can only be monitored for exposure and file name.
Zip Files—Only up to 100 of the files within a zip file are scanned and only up to 5 MB total of the zip file's contents are scanned. Cloudlock supports up to 10 levels of zip file nesting (a zip within a zip within a zip). Zip files are currently only supported in DLP policies.
Google Docs—In the Google platform, native Google Docs do not have "filetypes" per se (they have no filename extensions, for example), but they are monitored by Cloudlock. Only objects stored in Google Drive are monitored by Cloudlock. Gmail attachments that are not stored in Google Drive — like any other file or document stored outside Drive — are not monitored.
Embedded Images—Cloudlock examines the first 15 images embedded in a file; the entire text of the document is still examined for DLP. Supported file types are Excel spreadsheets, PowerPoint presentations, Word documents, PDFs, and ZIP files.
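To make the spreadsheet limits above concrete, the sketch below walks a sheet and stops at the first 1,000 rows, the first 50 columns, and 10,000 cells in total. The row-by-row traversal order and the name `cells_to_scan` are assumptions; the source states the limits but not the order in which cells are visited.

```python
MAX_ROWS, MAX_COLS, MAX_CELLS = 1000, 50, 10_000


def cells_to_scan(rows):
    """Yield (row, col, value) for cells that fall inside the documented limits.

    `rows` is a list of lists; blank cells are represented as None.
    """
    seen = 0
    for r, row in enumerate(rows[:MAX_ROWS]):
        for c, value in enumerate(row[:MAX_COLS]):
            if seen >= MAX_CELLS:
                return
            seen += 1  # blank cells (None) still count toward the cap
            yield r, c, value
```

Under these assumptions, a sheet that is 60 columns wide reaches the 10,000-cell cap after 200 rows, so later rows would not be examined even though they fall within the 1,000-row limit.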

Ownership and Exceptions

The policy can be modified to monitor all users in the environment or specific users, groups or OUs that might own the files. This would be useful in a situation where only specific departments or offices needed to be monitored for a policy and not all users. Additionally, you can add exceptions to who is monitored. If the entire domain is worth monitoring but admins or executives do not need to be monitored, you can add their OU or Group as an exception.
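The ownership scope and its exceptions can be thought of as two set checks. In this sketch the function name `is_monitored` and the group names are hypothetical, and it assumes that an exception always overrides inclusion.

```python
def is_monitored(owner_groups, monitored=None, exceptions=frozenset()):
    """Decide whether a file owner falls within a policy's ownership scope.

    owner_groups: names of the groups or OUs the file owner belongs to.
    monitored:    groups or OUs in scope; None means all users.
    exceptions:   groups or OUs excluded from monitoring (assumed to win).
    """
    owner_groups = set(owner_groups)
    if owner_groups & set(exceptions):
        return False
    return monitored is None or bool(owner_groups & set(monitored))
```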


Google Shared Folders

In the scenario where files from a user are shared to another user via a shared folder, only the original owner's files are scanned. Example: Jack has a shared folder in his My Drive called "Folder X" which he shares with Sally. Files in "Folder X" that Jack created himself will be scanned. However, files that Sally adds to "Folder X" or creates within the folder will NOT be scanned.

Exposure and Exceptions

Exposure is one of the most used features of Context policies. Exposure allows monitoring of what is exposed, how it is exposed, who exposed it, and the kind of exposure (public, private, domain-wide, and so on). Exposure is broken up by platform so you can decide which platforms to monitor for exposure. For example, if the majority of users are in a Google environment but one department uses Box for storage, you could set your exposure to Box and choose all or some of the exposure options to alert whenever something is shared in that platform. You can also use Exposure to alert on any public shares across all licensed platforms or to alert on shares from one Group or OU to another. Like Ownership, Exposure also has exceptions, which allow sharing with specific Groups, OUs, or Domains that have been validated within the company for exposure.

For more details on platform specifics and limitations, see Exposure by Platform.


Scanning with Exposure

When configuring a policy for exposure, keep in mind that the platforms selected for exposure will be the ONLY platforms scanned. For example, if Google, Office 365, and Slack are selected in the Platform section of a policy's configuration but only Google and Slack are selected in the Exposure settings, the policy will monitor only Google and Slack.
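The interaction between the two selections can be sketched as a set intersection. The function name `effective_platforms` is hypothetical, and treating a platform that appears only in the Exposure settings as not monitored is an assumption; the source covers only the case where the exposure selection is a subset of the platform selection.

```python
def effective_platforms(platform_selection, exposure_selection=None):
    """Return the platforms a policy ends up monitoring."""
    platforms = set(platform_selection)
    if exposure_selection is None:  # no exposure criteria configured
        return platforms
    # With exposure configured, only platforms selected there are scanned.
    return platforms & set(exposure_selection)
```

For the example above, selecting Google, Office 365, and Slack as platforms but only Google and Slack for exposure leaves Office 365 unmonitored.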