String Masking using RegEx


We are often asked to stop users from entering credit card numbers, or to filter inappropriate language out of data that users enter. With regular expressions and the code snippet below, this task becomes simple and quick to complete. Coupled with a before trigger, the code becomes an automated, pattern-driven masking tool that keeps watch over your data.


For this solution, let's assume that we need to mask Visa credit card numbers entered in Case Comments. Before we address the scenario, let's first look at the code used to mask the string.

[code apex]
public class StringUtils {

    static public String MaskString(String inString, List<String> inPatterns, String inMask, Integer inVisibleCharacters) {

        // validate the passed in variables
        if (inString == null || inString.length() < 1 || inPatterns == null || inPatterns.size() < 1 || inMask == null)
            return inString;
        if (inVisibleCharacters == null || inVisibleCharacters < 0)
            inVisibleCharacters = 0;

        // prime the internal variables to be used during processing
        String stringToMask = inString;
        String maskedString = inString;

        // iterate through each pattern and mask any matches leaving the last visible characters
        for (String regEx : inPatterns) {
            Pattern p = Pattern.compile(regEx);
            Matcher m = p.matcher(stringToMask);

            while (m.find()) {
                // find the start and end indexes of the match
                Integer startIdx = m.start();
                Integer endIdx = m.end();

                // extract the matched string
                String patternMatch = stringToMask.substring(startIdx, endIdx);

                // never keep more characters visible than the match contains
                Integer visibleCount = Math.min(inVisibleCharacters, patternMatch.length());

                // mask the string leaving any visible characters
                String partToBeMasked = patternMatch.substring(0, patternMatch.length() - visibleCount);
                String mask = '';
                for (Integer i = 0; i < partToBeMasked.length(); i++) {
                    mask += inMask;
                }

                // concatenate mask string with the last visible characters
                String maskedNumber = mask + patternMatch.substring(patternMatch.length() - visibleCount);

                // replace the card number with the masked number
                maskedString = maskedString.replace(patternMatch, maskedNumber);
            }
        }
        return maskedString;
    }
}
[/code]

Now, we create the trigger needed to scrub the Case Comment body. For the code below, assume that the code snippet above has been saved in a class named StringUtils.
[code apex]
trigger MaskCaseComments on CaseComment (before insert, before update) {

    for (CaseComment cc : Trigger.new) {
        if (cc.CommentBody != null)
            cc.CommentBody = StringUtils.MaskString(cc.CommentBody, new List<String>{'4\\d{3}[- ]*\\d{4}[- ]*\\d{4}[- ]*\\d{4}'}, '*', 4);
    }
}
[/code]

Now, each time a Case Comment is inserted or updated, the CommentBody value, if provided, is scanned for Visa credit card numbers. Each match is masked with asterisks (*), leaving only its last 4 characters visible. Note that separators inside the match (dashes and spaces) are masked along with the digits. The test class below illustrates the masking results.

[code apex]
@isTest
private class StringUtilsTest {

    @isTest
    static void testMaskString() {
        // create a pattern for matching, in this case we are looking for visa numbers
        List<String> listPatterns = new List<String>{'4\\d{3}[- ]*\\d{4}[- ]*\\d{4}[- ]*\\d{4}'};

        Test.startTest();

        // test to make sure the credit card number is masked leaving the last 4 digits
        System.assertEquals('my visa number is ***************1111',
            StringUtils.MaskString('my visa number is 4111-1111-1111-1111', listPatterns, '*', 4));

        Test.stopTest();
    }
}
[/code]
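
Because the pattern allows any run of dashes or spaces between the digit groups, it can help to check a few other input shapes. The sketch below assumes the StringUtils class above is deployed in the same org. The class name StringUtilsSeparatorExamples and the sample card numbers are made up for illustration.

[code apex]
@isTest
private class StringUtilsSeparatorExamples {

    // Same Visa pattern used in the trigger above
    private static final List<String> VISA = new List<String>{
        '4\\d{3}[- ]*\\d{4}[- ]*\\d{4}[- ]*\\d{4}'
    };

    @isTest
    static void masksSpaceSeparatedNumber() {
        // 19 matched characters with the last 4 visible, so 15 are masked
        System.assertEquals(
            'card ' + '*'.repeat(15) + '1111 on file',
            StringUtils.MaskString('card 4111 1111 1111 1111 on file', VISA, '*', 4));
    }

    @isTest
    static void masksUnseparatedNumber() {
        // 16 matched characters with the last 4 visible, so 12 are masked
        System.assertEquals(
            '*'.repeat(12) + '1111',
            StringUtils.MaskString('4111111111111111', VISA, '*', 4));
    }

    @isTest
    static void leavesNonVisaNumberAlone() {
        // The only '4' is the final character, so the pattern cannot match
        String input = 'card 5500-0000-0000-0004 on file';
        System.assertEquals(input, StringUtils.MaskString(input, VISA, '*', 4));
    }
}
[/code]

The masked length follows the length of the match, so the same card number produces a different number of asterisks depending on how it was typed.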


For this example, we used credit card masking as the scenario. Because the matching is driven by regular expressions, the same snippet can mask many other kinds of string values, from words and sentences to numbers and symbols. To mask something else, supply a different list of patterns.
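
As a rough sketch of the word-filtering case, the same method can be called with a word pattern and zero visible characters so that the whole match is hidden. This assumes the StringUtils class above is deployed in the same org. The class name, the sample word 'darn', and the '#' mask are placeholders, not part of the original recipe.

[code apex]
@isTest
private class StringUtilsWordMaskExample {

    @isTest
    static void masksFlaggedWord() {
        // Hypothetical word list: (?i) ignores case, \\b limits the match to whole words
        List<String> wordPatterns = new List<String>{ '(?i)\\bdarn\\b' };

        // 0 visible characters masks the entire match
        String result = StringUtils.MaskString('That was a Darn shame', wordPatterns, '#', 0);

        System.assertEquals('That was a #### shame', result);
    }
}
[/code]

Each entry in the pattern list is applied in turn, so more words can be covered by adding entries to the list or by combining them in one pattern with alternation.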