This guide is a basic introduction to regular expressions (regex) as they are used in Cloudlock policies. It is intended to provide a starting point for using and interpreting regular expressions. There is much more to know about regex, however, and many resources are available.
Cisco Cloudlock uses the Java implementation of regex. Other implementations exist and differ in some ways, but they are beyond the scope of this guide.
Regular expressions are used in Cisco Cloudlock policies to detect sequences of characters in documents and other objects stored in supported cloud platforms. Credit Card and Social Security Number policies use built-in, proprietary regular expressions (as well as other techniques) to find matching patterns. Custom regex policies can detect any pattern, but you must specify those patterns yourself.
As you use this guide, you can practice by using the regex tester built into the Cloudlock Policy tool:
You enter a regular expression as shown at #1 above and some test text as shown at #2, and any matches are highlighted at the bottom. Similar regex testers are available on the web. These are helpful too, but if you use one, make sure it supports Java-style regex, since that is the implementation Cloudlock uses.
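If you prefer to experiment outside the built-in tester, the same check can be approximated with the standard `java.util.regex` classes. The following is a minimal sketch, not Cloudlock's own code: the class name `RegexTesterSketch`, the helper `findMatches`, and the invoice-number pattern are all hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexTesterSketch {
    // Returns every substring of testText matched by regex, in order of appearance.
    static List<String> findMatches(String regex, String testText) {
        Matcher matcher = Pattern.compile(regex).matcher(testText);
        List<String> matches = new ArrayList<>();
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // Hypothetical pattern: "INV-" followed by exactly six digits.
        String regex = "\\bINV-\\d{6}\\b";
        String testText = "Paid INV-004217 and INV-004218; INV-42 is malformed.";
        System.out.println(findMatches(regex, testText));
        // prints [INV-004217, INV-004218]
    }
}
```

Note that backslashes are doubled only because the pattern sits inside a Java string literal. In a regex tester you would type the pattern with single backslashes: `\bINV-\d{6}\b`.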
Because regular expressions are so compact, they can be difficult to interpret, at least at first, and testing is important. A good process to follow in setting up a regex-based component in your cloud security monitoring program is to use a regex tester to refine your expression, then test it on sample data to make sure it identifies violations without flagging “false positives” — at least not to excess. Only when you’re satisfied with your test results should a regex be considered “ready to deploy” and placed into use.
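The "test it on sample data" step can be sketched as a small harness that runs a candidate regex over labeled samples and counts the disagreements. This is an illustrative sketch only: the class `SampleDataCheck`, the method `evaluate`, the sample strings, and the employee-ID pattern are all assumptions made for the example, not part of Cloudlock.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

public class SampleDataCheck {
    // Returns {falsePositives, misses} for a regex over samples labeled
    // true (should be flagged) or false (should not be flagged).
    static int[] evaluate(String regex, Map<String, Boolean> labeledSamples) {
        Pattern pattern = Pattern.compile(regex);
        int falsePositives = 0;
        int misses = 0;
        for (Map.Entry<String, Boolean> sample : labeledSamples.entrySet()) {
            boolean found = pattern.matcher(sample.getKey()).find();
            boolean expected = sample.getValue();
            if (found && !expected) falsePositives++;
            if (!found && expected) misses++;
        }
        return new int[] {falsePositives, misses};
    }

    public static void main(String[] args) {
        // Hypothetical sample data: only the first string is a real violation.
        Map<String, Boolean> samples = new LinkedHashMap<>();
        samples.put("Badge EMP-123456 issued", true);
        samples.put("Order total 1234567 units", false);
        samples.put("Call ext. 4471", false);

        // A loose first attempt flags any run of six digits.
        System.out.println(Arrays.toString(evaluate("\\d{6}", samples)));
        // prints [1, 0]

        // A refined pattern requires the "EMP-" prefix and word boundaries.
        System.out.println(Arrays.toString(evaluate("\\bEMP-\\d{6}\\b", samples)));
        // prints [0, 0]
    }
}
```

Here the loose pattern produces one false positive (the order total), while the refined pattern flags only the genuine sample.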
Any single regular expression in Cisco Cloudlock is limited to 2048 characters (which would be an extremely long regex). If you find you’re getting close to the character limit, contact Cloudlock; we may be able to help achieve the same result with a smaller, more efficient regex.
While it can be essential to identify strings in documents and objects, you often need additional tools to make sure your monitoring system is pinpointing real issues. Cloudlock policies include tools that work in combination with regex to give the best results. These include:
- Exceptions: these are, in fact, regular expressions too, but here they specify when “a match is not a match”. For example, if your regular expression is looking for exposed customer records, but you know there is one labeled “John Q. Customer” that’s just a sample, you can make that an exception so it doesn’t generate an incident.
- Proximity: it can be difficult to be sure that a particular set of characters really is what it appears to be. If it’s within a few characters of a label such as “DOB” or “Birthdate”, though, that can be a giveaway. Include another regex in the Cloudlock Proximity tool to help with this.
- Threshold: sometimes quantity matters. The threshold setting enables you to identify (or identify and flag as more significant) an object containing up to 1000 pattern matches.
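To make the interplay of these three tools concrete, here is a rough conceptual sketch in Java. It is not how Cloudlock implements them. It assumes that an exception is tested against the matched text itself, that proximity means "a second regex matches within a fixed number of characters on either side", and that the threshold is a minimum count of qualifying matches. The class name, method names, patterns, and the 10-character window are all hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PolicyToolsSketch {
    // Counts matches of `pattern` that are not fully matched by `exception`
    // and that have a `proximity` match within `window` characters on either side.
    static int countQualifyingMatches(String text, Pattern pattern, Pattern exception,
                                      Pattern proximity, int window) {
        int count = 0;
        Matcher m = pattern.matcher(text);
        while (m.find()) {
            if (exception.matcher(m.group()).matches()) {
                continue; // "a match is not a match"
            }
            int from = Math.max(0, m.start() - window);
            int to = Math.min(text.length(), m.end() + window);
            if (proximity.matcher(text.substring(from, to)).find()) {
                count++;
            }
        }
        return count;
    }

    // Assumed threshold semantics: flag when at least `threshold` matches qualify.
    static boolean meetsThreshold(int qualifyingMatches, int threshold) {
        return qualifyingMatches >= threshold;
    }

    public static void main(String[] args) {
        Pattern date = Pattern.compile("\\b\\d{2}/\\d{2}/\\d{4}\\b");
        Pattern sampleDate = Pattern.compile("01/01/1900");          // known placeholder
        Pattern label = Pattern.compile("(?i)\\b(DOB|Birthdate)\\b"); // nearby label
        String text = "DOB: 04/12/1987. Invoice date 03/02/2021 for the order placed "
                + "by our long-standing client. Birthdate 01/01/1900 (sample).";

        int count = countQualifyingMatches(text, date, sampleDate, label, 10);
        System.out.println(count);                    // prints 1
        System.out.println(meetsThreshold(count, 1)); // prints true
        System.out.println(meetsThreshold(count, 2)); // prints false
    }
}
```

Of the three dates in the sample text, the invoice date is dropped because no label is nearby, and the placeholder date is dropped by the exception, leaving a single qualifying match.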