Modify the Sensitive Data Dictionary

This article will show you how to modify the sensitive data dictionary that NG SAST uses when analyzing your code.

Detecting sensitive data leaks is a subscription-only feature; please reach out to ShiftLeft for more information.

Step 1: Create a New File to Hold New Definitions

To begin, use the ShiftLeft command-line interface to create a file to hold your newly defined dictionary. For this exercise, you'll create a file named filepath/my-app-dictionary.policy that uses the no-dictionary template provided by ShiftLeft:

sl policy create no-dictionary filepath/my-app-dictionary.policy

Replace filepath with the location where you want to save your policy file.

Step 2: Define Your Dictionary

Once you've created my-app-dictionary.policy, you can begin adding your sensitive-data directives to the file. Such directives have the following form:

DATA $group = VAR $term1, ..., $term_n

Parameter            Description
$group               The name of the sensitive data group, e.g., internalSecrets
$term1 ... $term_n   Keywords to search for
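
If you keep your terms in a script or spreadsheet, the directive form above can be generated rather than typed by hand. The following is a minimal Python sketch; the function name render_directive is hypothetical and not part of the ShiftLeft tooling. It only assembles the text of one directive line.

```python
def render_directive(group: str, terms: list[str]) -> str:
    """Return one 'DATA <group> = VAR <term>, ...' line (illustrative helper)."""
    # Collapse repeated whitespace inside each term and drop empty entries.
    cleaned = [" ".join(t.split()) for t in terms if t.strip()]
    if not group.strip() or not cleaned:
        raise ValueError("a directive needs a group name and at least one term")
    return f"DATA {group.strip()} = VAR {', '.join(cleaned)}"


# Example with the group name used in the table above.
print(render_directive("internalSecrets", ["master key", "encrypt key"]))
```

The printed line can be pasted into the policy file as is.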

Example

NG SAST's default policy contains the following directive to characterize highly sensitive data:

DATA highlySensitive = VAR master key, cvv num, cvv, cvc num, cvc, encrypt key, crypt key

This directive instructs NG SAST to look for exact matches to the specified terms, variations of them, and combinations of those terms. You can add terms to (or remove terms from) the directive to change which variables NG SAST treats as sensitive.

Let's say that you want NG SAST to report only a limited set of Personally Identifiable Information (PII) categories: names, email addresses, and phone numbers, ignoring all other categories. To do so, append the following directives to the my-app-dictionary.policy file:

DATA pii = VAR first name, last name, middle name, middle initials, full name, maiden name, player name, family name
DATA pii = VAR email, email addr, email address, alternate email
DATA pii = VAR phone number, phone, mobile, landline number, home phone number, home phone num, office phone number, office phone num, alternate phone num, alternate phone number, phone number extension

The complete policy file is therefore:

IMPORT io.shiftleft/default
# PII
DATA pii = VAR first name, last name, middle name, middle initials, full name, maiden name, player name, family name
DATA pii = VAR email, email addr, email address, alternate email
DATA pii = VAR phone number, phone, mobile, landline number, home phone number, home phone num, office phone number, office phone num, alternate phone num, alternate phone number, phone number extension

Please note that:

  • Matching is case insensitive
  • Terms containing spaces also match alternative forms (e.g., first name also matches first_name, first-name, etc.)
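
To get a feel for which identifiers a term might cover, the two notes above can be approximated with a regular expression. This Python sketch is an illustration only and not NG SAST's actual matching logic; the helper names are hypothetical, and treating a missing separator (as in firstName) as a match is an assumption about what "etc." includes.

```python
import re


def term_pattern(term: str) -> re.Pattern:
    """Build a case-insensitive pattern for a space-separated term."""
    words = [re.escape(w) for w in term.split()]
    # Between words, allow any run of spaces, underscores, or hyphens
    # (including none at all, which is an assumption).
    return re.compile(r"[\s_-]*".join(words), re.IGNORECASE)


def matches(term: str, identifier: str) -> bool:
    """Return True if the whole identifier is a variant of the term."""
    return term_pattern(term).fullmatch(identifier) is not None


for name in ["first_name", "first-name", "FIRST_NAME", "last_name"]:
    print(name, matches("first name", name))
```

The sketch compares the whole identifier against a single term; it does not attempt the combinations of terms that NG SAST also looks for.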

Step 3: Validate Your Dictionary File

Once you've defined your dictionary, validate the file by running the following command:

sl policy validate my-app-dictionary.policy

If the policy has syntax or semantic issues, the command exits with a non-zero status code.
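
Because the result is reported through the exit status, the check is easy to automate, for example in a build script. The following Python sketch assumes only what is stated above (zero means valid, non-zero means a problem); the function name policy_is_valid is hypothetical, and the sl invocation appears as a comment so that the snippet does not require the CLI to be installed.

```python
import subprocess
import sys


def policy_is_valid(command: list[str]) -> bool:
    """Run a validation command and report whether it exited with status 0."""
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode != 0:
        # Surface the status so a calling script can log or act on it.
        print(f"validation failed (exit status {result.returncode})", file=sys.stderr)
        return False
    return True


# Intended use with the command from this step:
# policy_is_valid(["sl", "policy", "validate", "my-app-dictionary.policy"])
```

A build script could stop before the upload step whenever the function returns False.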

Step 4: Upload the Dictionary to ShiftLeft's Repository

Before you can use your dictionary, you must upload it to the ShiftLeft repository using sl policy push <policyLabel> <filepath>:

sl policy push myNewDictionary my-app-dictionary.policy

If the upload succeeds, the command returns the full name under which the dictionary file is available, e.g., ebad68...ff7e/myNewDictionary:latest. Note that:

  • ebad68...ff7e is your ShiftLeft organization ID
  • myNewDictionary is the policy label
  • latest is the tag ShiftLeft assigned to the policy by default
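
If a script needs the individual parts of the returned name, the structure described above (organization ID, slash, policy label, colon, tag) can be split as follows. This is a minimal Python sketch, not a ShiftLeft API: the names PolicyName and parse_policy_name are hypothetical, and falling back to latest when no tag is present is an assumption based on latest being the default tag.

```python
from typing import NamedTuple


class PolicyName(NamedTuple):
    org_id: str
    label: str
    tag: str


def parse_policy_name(full_name: str) -> PolicyName:
    """Split '<orgID>/<policyLabel>[:<tag>]' into its parts."""
    org_id, sep, rest = full_name.partition("/")
    if not sep or not org_id or not rest:
        raise ValueError(f"expected '<orgID>/<policyLabel>[:<tag>]', got {full_name!r}")
    label, _, tag = rest.partition(":")
    # Assumption: a name without an explicit tag refers to 'latest'.
    return PolicyName(org_id, label, tag or "latest")


# The organization ID is shown elided, exactly as in the example above.
print(parse_policy_name("ebad68...ff7e/myNewDictionary:latest"))
```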

Step 5: Assign the Dictionary

Assigning the dictionary to your application ensures that NG SAST uses it the next time it analyzes the app's code:

sl analyze --policy <orgID>/<policyLabel> --app <name> <path>

For the sample dictionary created in this article, the command is therefore:

sl analyze --policy ebad68...ff7e/myNewDictionary --app myApp ~/path/to/app

At this point, you are ready to proceed with your next code analysis. In general, a modified dictionary contains fewer sensitive-data categories than the default dictionary, so you will likely see fewer results.

Conclusion

In this tutorial, you learned how to create a custom dictionary that identifies the sensitive data variables you provide and how to assign it for use with NG SAST.