jjdope
Staff
Article Id 421930
Description This article describes how FortiProxy/FortiGate performs High, Medium, and Low confidence SSN detection, and explains why the match count may not increase as expected even when multiple SSNs are present in the inspected content.
Scope FortiProxy, FortiGate.
Solution

When a DLP sensor entry uses the high-confidence SSN dictionary (g-fg-usa-natl_id-ssn-dict-high), administrators may notice that the match count does not increase as expected, even when the inspected content contains multiple SSNs. This occurs because a high-confidence match requires more than a regex hit.

 

Example entry:

 

config dlp sensor
    edit "test-tac"
        config entries
            edit 2
                set dictionary "g-fg-usa-natl_id-ssn-dict-high"
                set count 3
            next
        end
    next
end


Example data used for testing:

 

SSN
Robert Aragon 489-36-8350 4929-3813-3266-4295
Ashley Borden 514-14-8905 5370-4638-8881-3020
Thomas Conley 690-05-5315 4916-4811-5814-8111


Only one high-confidence match is logged, even though there are multiple valid SSNs.

 

The g-fg-usa-natl_id-ssn-dict-high dictionary requires three components to align for a high-confidence SSN match:

 

  • Regex Match: The SSN pattern must match valid U.S. SSN formatting.
  • Data Validation: The number must pass internal validation checks.
  • Match-Around / Contextual Validation: The SSN must appear near a context keyword such as 'SSN' or 'SIN', typically within 48 characters.

When reviewing the example data, the first SSN (Robert Aragon) appears close enough to the keyword SSN, allowing all three checks to pass.
However, the remaining SSNs do not fall within the required 48-character window of the context keyword.

 

This results in:

  • Regex match -> Yes.
  • Data validation -> Yes.
  • Context match -> Only once.

Because the dictionary entry requires count = 3 and only one high-confidence match is registered, FortiProxy does not block the traffic.
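
The three checks described above can be approximated with a short Python sketch. This is only an illustration of the logic, not the actual FortiProxy/FortiGate implementation: the regex, the validation rules (public SSA numbering rules are used as a stand-in for the internal check), the way the 48-character distance is measured, and all function names are assumptions.

```python
import re

# Assumed SSN pattern: three, two, and four digits separated by hyphens.
SSN_RE = re.compile(r"\b(\d{3})-(\d{2})-(\d{4})\b")
# Context keywords named in this article; the real dictionary may use others.
KEYWORD_RE = re.compile(r"\b(?:SSN|SIN)\b", re.IGNORECASE)
WINDOW = 48  # match-around distance in characters, as stated in the article


def passes_validation(area: str, group: str, serial: str) -> bool:
    """Assumed stand-in for the internal check: public SSA numbering rules."""
    if area in ("000", "666") or area.startswith("9"):
        return False
    return group != "00" and serial != "0000"


def has_context(text: str, start: int, end: int) -> bool:
    """True if a keyword ends within WINDOW characters before the match
    or starts within WINDOW characters after it."""
    for kw in KEYWORD_RE.finditer(text):
        if kw.end() <= start and start - kw.end() <= WINDOW:
            return True
        if kw.start() >= end and kw.start() - end <= WINDOW:
            return True
    return False


def count_high_confidence(text: str) -> int:
    """Count SSNs that pass the regex, validation, and context checks."""
    count = 0
    for m in SSN_RE.finditer(text):
        if passes_validation(*m.groups()) and has_context(text, m.start(), m.end()):
            count += 1
    return count


sample = "\n".join([
    "SSN",
    "Robert Aragon 489-36-8350 4929-3813-3266-4295",
    "Ashley Borden 514-14-8905 5370-4638-8881-3020",
    "Thomas Conley 690-05-5315 4916-4811-5814-8111",
])

matches = count_high_confidence(sample)
print(matches, matches >= 3)  # 1 False
```

Under these assumptions only the first SSN has a keyword inside its window, so the simulated count stays at 1 and the threshold of 3 is not reached.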

 

To achieve three high-confidence matches, each SSN must have a context keyword within its own match-around window, for example on the same line:

 

SSN Robert Aragon 489-36-8350 4929-3813-3266-4295
SSN Ashley Borden 514-14-8905 5370-4638-8881-3020
SSN Thomas Conley 690-05-5315 4916-4811-5814-8111

 

FortiProxy will then increment the match count to 3, and the DLP rule triggers as configured.
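
To see why the two layouts behave differently, the distance between each SSN and its nearest context keyword can be measured. The Python sketch below is a diagnostic aid only. The helper name `keyword_gaps`, the regexes, and the assumption that each line break counts as one character are illustrative and are not taken from the product.

```python
import re

# Assumed patterns; the real dictionary may differ.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
KEYWORD_RE = re.compile(r"\b(?:SSN|SIN)\b", re.IGNORECASE)


def keyword_gaps(text):
    """Return (ssn, gap) pairs, where gap is the number of characters
    between the SSN and the nearest keyword, or None if no keyword exists."""
    keywords = [(k.start(), k.end()) for k in KEYWORD_RE.finditer(text)]
    result = []
    for m in SSN_RE.finditer(text):
        gaps = []
        for k_start, k_end in keywords:
            if k_end <= m.start():      # keyword precedes the SSN
                gaps.append(m.start() - k_end)
            elif k_start >= m.end():    # keyword follows the SSN
                gaps.append(k_start - m.end())
        result.append((m.group(), min(gaps) if gaps else None))
    return result


rows = [
    "Robert Aragon 489-36-8350 4929-3813-3266-4295",
    "Ashley Borden 514-14-8905 5370-4638-8881-3020",
    "Thomas Conley 690-05-5315 4916-4811-5814-8111",
]
single_header = "\n".join(["SSN"] + rows)            # keyword once, at the top
per_line = "\n".join("SSN " + row for row in rows)   # keyword on every line

print(keyword_gaps(single_header))
# [('489-36-8350', 15), ('514-14-8905', 61), ('690-05-5315', 107)]
print(keyword_gaps(per_line))
# [('489-36-8350', 15), ('514-14-8905', 15), ('690-05-5315', 15)]
```

With a single header keyword, only the first SSN falls within 48 characters. With the keyword repeated on each line, all three do.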

 

Workaround:

Configure an additional DLP rule that uses the medium-confidence SSN dictionary. This rule blocks the traffic when the high-confidence rule does not reach its configured match count.
