FortiSIEM Discussions
KarlH
Contributor

Seeking some code review on parser xml code, failing testing. FortiSIEM 7.1

Hello, the below parser xml is failing testing on 7.1, any help is much appreciated. 

<eventFormatRecognizer>
<![CDATA[.*Vendor _ATTACK\s+]]>
</eventFormatRecognizer>
<parsingInstructions>
<collectFieldsByRegex src="$_rawmsg">
<regex><![CDATA[.*Vendor Name {<_body:gPatMesgBody>}]]></regex>
</collectFieldsByRegex>
<setEventAttribute attr="eventType">VendorAlert</setEventAttribute>
<collectAndSetAttrByJSON src="$_body">
<attrKeyMap attr="accountid" key="Account Id"/>
<attrKeyMap attr="VendorAttackModule" key="Attack Module"/>
</collectAndSetAttrByJSON>
</parsingInstructions>
Karl Henning, Security Engineer, CISSP
1 Solution
Rob_SIEM

Hi Karl,

 

The attached example parser worked for the full sample log you provided. 

 

The important part of this parser is this section:

  <!-- because our json is not ideally formed, you have to strip away the ["my value"] to achieve my value -->
  <setEventAttribute attr="msg">replaceStringByRegex($msg, ":\s*\"\[\\", ":")</setEventAttribute>
  <setEventAttribute attr="msg">replaceStringByRegex($msg, "\\\"\]\",", "\",")</setEventAttribute>
 
Remember that each of the values of the json object are strings, even though those strings are actually an array of json objects. So as a quick workaround we strip off the array designation.
 
"Account Id" : "[\"12345\"]" is a string, so we instead strip this down to "Account Id" : "12345" as a workaround. This isn't perfect however.
 
Ideally the format should have been:
"Account Id" : [ "account1", "account2", "account3" ]
instead of
"Account Id" : "[ \"account1\", \"account2\", \"account3\"] "
The restriction here is we treat entire value as one object instead of separate objects if there happens to be more than one. 
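To see what the two replacements do to the text, here is a minimal sketch in Python rather than FortiSIEM. The patterns are a reading of the two `replaceStringByRegex` calls above with the quoting unwound, which is an assumption, and the two-field sample is shortened from the real log.

```python
import json
import re

# Two fields in the vendor's format: every value is a string holding an escaped array.
raw = r'{"Account Id":"[\"12345\"]","Attack Module":"[\"kernel32.dll\"]"}'

# First replacement: remove the  "[\  that follows a colon (the colon is kept).
step1 = re.sub(r':\s*"\[\\', ':', raw)

# Second replacement: turn  \"]",  back into  ",
step2 = re.sub(r'\\"\]",', '",', step1)

event = json.loads(step2)
print(step2)
print(event["Account Id"])
```

In this Python rendition the first field comes out as a plain `12345`, while the last field keeps its trailing `"]` because no comma follows it. Whether the attached parser handles that last field differently is not shown in the thread.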
 
Thanks

kcanalichio
New Contributor III

Without seeing the error message you get when you run the test, or the raw event, there is not much anyone will be able to do for you here.

KarlH

 

[Attached screenshot: failed.png]

 

You are right, I should have been more complete in my reporting. Sadly, the FortiSIEM test tool knows it failed but neglects to share why. Is there ever a time when the test tool actually tells one why it failed? This is 7.1.

Karl Henning, Security Engineer, CISSP
kcanalichio
New Contributor III

That means the event didn't match the parser, i.e. your event format recognizer didn't match. When a parser does match, the Used Parser column is filled in with the parser being used. One of the two below should work, depending on how specific the match needs to be.

 

<![CDATA[.*_ATTACK\s+]]>

 

<![CDATA[.*MORPHISEC_ATTACK\s+]]>  

 

KarlH

Ah ha, so there is an inference in that message, which you have decoded.

 

so you are saying do not use

{<_body:gPatMesgBody>}

... ok thanks! Let me try those and report back.

Karl Henning, Security Engineer, CISSP
kcanalichio
New Contributor III

No, using that is fine for the parser body. Your initial problem is in this block. The event message has to match something in the block for the parser to be used.

But as far as your parser body goes: unless the event has the string Vendor in it, it won't parse anything, or it will error out.

 

<eventFormatRecognizer>
<![CDATA[.*Vendor _ATTACK\s+]]>
</eventFormatRecognizer>

 

KarlH

Hi,

so the first couple lines of the log show :


2024-10-08 14:46:58 Vendor-EPTP INFO Vendor_ATTACK {"Account Id":"[\"0cebd16e-eba3-40a1-a2b6-88e9e20787d3\"]","Attack Module":"[\"kernel32.dll\"]","Attack Time":"[\"2024-10-08T14:46:49.769Z\"]","Code Processed":"[\"0x007ffcac8a8600 MOV RAX, RSP\"]","Command Line":"[\"C:/Users/shaned/Desktop/winwword.exe\"]","Computer
 
and we have :
<eventFormatRecognizer>
<![CDATA[.*Vendor_ATTACK\s+]]>
</eventFormatRecognizer>
 
 
We are already doing that and the test is failing
for the below code.
 
 
<eventFormatRecognizer>
<![CDATA[.*VENDOR_ATTACK\s+]]>
</eventFormatRecognizer>
<parsingInstructions>
<collectFieldsByRegex src="$_rawmsg">
<regex><![CDATA[.*VENDOR_ATTACK {<_body:gPatMesgBody>}]]>
</regex></collectFieldsByRegex>
<setEventAttribute attr="eventType">VendorAlert</setEventAttribute>
<collectAndSetAttrByJSON src="$_body"><attrKeyMap attr="accountid" key="Account Id"/>
<attrKeyMap attr="VendorAttackModule" key="Attack Module"/>
</collectAndSetAttrByJSON></parsingInstructions>
 
Is it the wild card maybe? I've no idea about this syntax.. sorry
 
Karl Henning, Security Engineer, CISSP
Rob_SIEM

Hi Karl,
 
The error you see here is in the matching of the event format recognizer. When you initialized the test, the parser expected the test log above to be matched to this parser, but because the event format recognizer regex did not match any portion of the log, it did not proceed to use this parser to parse the log, returning a failure.
 
<eventFormatRecognizer>
<![CDATA[.*Vendor _ATTACK\s+]]>
</eventFormatRecognizer>
Compare that with your log format snippet below (we can help further if you attach a full sample log).
 
2024-10-08 14:46:58 Vendor-EPTP INFO Vendor_ATTACK {"Account Id":"[\"0cebd16e-eba3-40a1-a2b6-88e9e20787d3\"]","Attack Module":"[\"kernel32.dll\"]","Attack Time":"[\"2024-10-08T14:46:49.769Z\"]","Code Processed":"[\"0x007ffcac8a8600 MOV RAX, RSP\"]","Command Line":"[\"C:/Users/shaned/Desktop/winwword.exe\"]"}
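As a quick check outside FortiSIEM, the mismatch can be reproduced with Python's `re` module standing in for the parser's regex engine (an assumption; the two engines are not identical). The log line is a shortened copy of the snippet above. The only difference between the two patterns is the space between `Vendor` and `_ATTACK`.

```python
import re

# Shortened copy of the sample log line from this thread.
log = r'2024-10-08 14:46:58 Vendor-EPTP INFO Vendor_ATTACK {"Account Id":"[\"0cebd16e-eba3-40a1-a2b6-88e9e20787d3\"]"}'

# Recognizer as posted: note the space between "Vendor" and "_ATTACK".
with_space = re.compile(r'.*Vendor _ATTACK\s+')
# Same recognizer without the space.
without_space = re.compile(r'.*Vendor_ATTACK\s+')

matched_with_space = with_space.search(log) is not None
matched_without_space = without_space.search(log) is not None

print(matched_with_space, matched_without_space)  # False True
```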
 
 
For performance considerations it is best to try to match as close to the start of the log as possible, and as specific as possible to prevent issues with the parser incorrectly matching other logs that contain the given string. 
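The point about specificity can be illustrated with a small Python sketch. The anchored pattern uses plain regex stand-ins for `gPatYear`, `gPatTime` and the other named patterns; these are rough approximations chosen for the example, not the FortiSIEM definitions, and `other_log` is an invented line from an unrelated device.

```python
import re

# Loose recognizer: matches the marker anywhere in the line.
LOOSE = re.compile(r'.*Vendor_ATTACK\s+')

# Anchored recognizer: date, time, host, severity word, then the marker.
# The sub-patterns are illustrative approximations only.
ANCHORED = re.compile(
    r'^\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2}\s+\S+\s+\w+\s+Vendor_ATTACK'
)

vendor_log = '2024-10-08 14:46:58 Vendor-EPTP INFO Vendor_ATTACK {}'
# Hypothetical unrelated log that merely mentions the marker string.
other_log = '2024-10-08 14:47:10 mailgw INFO Subject: Vendor_ATTACK report attached'

loose_hits = [LOOSE.search(line) is not None for line in (vendor_log, other_log)]
anchored_hits = [ANCHORED.search(line) is not None for line in (vendor_log, other_log)]

print(loose_hits)     # [True, True]
print(anchored_hits)  # [True, False]
```

The loose pattern claims both lines, while the anchored one only claims the line that has the marker in the expected position.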
 
 
I would recommend this as a parser template. 
 
 
  <eventFormatRecognizer><![CDATA[<:gPatYear>-<:gPatMon>-<:gPatDay>\s+<:gPatTime>\s+<:gPatHostName>\s+<:gPatWord>\s+Vendor_ATTACK]]></eventFormatRecognizer>
 
  <parsingInstructions>
    <collectFieldsByRegex src="$_rawmsg">
      <regex><![CDATA[<_year:gPatYear>-<_mon:gPatMon>-<_day:gPatDay>\s+<_time:gPatTime>\s+(?:<reptDevIpAddr:gPatIpAddr>|<reptDevName:gPatHostName>)\s+<eventSeverityCat:gPatWord>\s+Vendor_ATTACK\s+<msg:gPatMesgBody>]]></regex>
    </collectFieldsByRegex>
 
    <setEventAttribute attr="deviceTime">toDateTime($_mon, $_day, $_year, $_time)</setEventAttribute>

    <!-- Give your event types clear names like vendor-model-eventType e.g. Microsoft-Office-LoginFailure -->
    <setEventAttribute attr="eventType">Vendor_Attack_Generic</setEventAttribute>
    <setEventAttribute attr="eventSeverity">1</setEventAttribute>

    <collectAndSetAttrByJSON src="$msg">
      <attrKeyMap attr="tenantId" key="&quot;Account Id&quot;"/>
      <attrKeyMap attr="module" key="&quot;Attack Module&quot;"/>
      <!-- 2024-10-08T14:46:49.769Z  must be converted to epoch later in parser for deviceTime -->
      <attrKeyMap attr="_eventTime" key="&quot;Attack Time&quot;"/>
      <attrKeyMap attr="errorCode" key="&quot;Code Processed&quot;"/>
      <attrKeyMap attr="command" key="&quot;Command Line&quot;"/>
    </collectAndSetAttrByJSON>

    <!-- because our json is not ideally formed, you have to strip away the ["my value"] to achieve my value -->
    <when test="exist tenantId">
      <setEventAttribute attr="tenantId">replaceStringByRegex($tenantId, "[\[\]\"]+", "")</setEventAttribute>
    </when>
    <when test="exist module">
      <setEventAttribute attr="module">replaceStringByRegex($module, "[\[\]\"]+", "")</setEventAttribute>
    </when>

    <when test="exist errorCode">
      <setEventAttribute attr="errorCode">replaceStringByRegex($errorCode, "[\[\]\"]+", "")</setEventAttribute>
    </when>

    <when test="exist command">
      <setEventAttribute attr="command">replaceStringByRegex($command, "[\[\]\"]+", "")</setEventAttribute>
    </when>

    <when test="exist _eventTime">
      <switch>
        <case>
          <!-- 2024-10-08T14:46:49.769Z  must be converted to epoch later in parser for deviceTime -->
          <collectFieldsByRegex src="$_eventTime">
            <regex><![CDATA[<_year:gPatYear>-<_mon:gPatMon>-<_day:gPatDay>T<_time:gPatTime>\.\d+(?:<_tz:gPatTimeZone>)?]]></regex>
          </collectFieldsByRegex>

          <!-- Set Date Time -->
          <choose>
            <when test="exist _tz">
              <setEventAttribute attr="eventTime">toDateTime($_mon, $_day, $_year, $_time, $_tz)</setEventAttribute>
            </when>
            <otherwise>
              <setEventAttribute attr="eventTime">toDateTime($_mon, $_day, $_year, $_time)</setEventAttribute>
            </otherwise>
          </choose>
        </case>
        <default/>
      </switch>
    </when>
 
  </parsingInstructions>
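For readers who want to check the clean-up steps outside FortiSIEM, here is a rough Python equivalent of two pieces of the template: the `[\[\]\"]+` strip applied to an extracted value, and the conversion of the ISO timestamp to epoch seconds. The helper name `strip_array_wrapper` is made up for this sketch, the trailing `Z` is treated as UTC, and the timestamp is stripped first for simplicity, whereas the template runs its regex directly on `_eventTime`.

```python
import re
from datetime import datetime, timezone


def strip_array_wrapper(value):
    """Remove every '[', ']' and '"' character, as the template's regex does."""
    return re.sub(r'[\[\]"]+', '', value)


# Values as they come out of the JSON step: the array is still text.
tenant_id = strip_array_wrapper('["0cebd16e-eba3-40a1-a2b6-88e9e20787d3"]')
attack_time = strip_array_wrapper('["2024-10-08T14:46:49.769Z"]')

# Parse the ISO 8601 timestamp; the literal Z suffix means UTC.
parsed = datetime.strptime(attack_time, '%Y-%m-%dT%H:%M:%S.%fZ').replace(tzinfo=timezone.utc)
epoch_seconds = parsed.timestamp()

print(tenant_id)
print(parsed.isoformat(), epoch_seconds)
```

Note that the strip removes brackets and double quotes anywhere in the value, not only at the ends, so a value that legitimately contains those characters would lose them too.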
 
 
I would recommend seeing if you can get the json log body in the unescaped format as well.  
//In this format, each of the arrays is treated as a string literal - [\"0cebd16e-eba3-40a1-a2b6-88e9e20787d3\"] is a string, not an array, because of the outer double quotes
//Both formats are valid JSON, but this one does not encode the arrays as actual arrays
{
   "Account Id":"[\"0cebd16e-eba3-40a1-a2b6-88e9e20787d3\"]",
   "Attack Module":"[\"kernel32.dll\"]",
   "Attack Time":"[\"2024-10-08T14:46:49.769Z\"]",
   "Code Processed":"[\"0x007ffcac8a8600 MOV RAX, RSP\"]",
   "Command Line":"[\"C:/Users/shaned/Desktop/winwword.exe\"]"
}
In your sample log above, we cannot use array or subscript notation without massaging the data first
<attrKeyMap attr="tenantId" key="&quot;Account Id&quot;[0]"/>   -- will not work, as the value is a string
<attrKeyMap attr="tenantId" key="&quot;Account Id&quot;"/> -- will work, as the value is a string, but the value is ["0cebd16e-eba3-40a1-a2b6-88e9e20787d3"] when you just want 0cebd16e-eba3-40a1-a2b6-88e9e20787d3
 
//Proper json should look like this - you can see the nested arrays
{
   "Account Id":["0cebd16e-eba3-40a1-a2b6-88e9e20787d3"],
   "Attack Module":["kernel32.dll"],
   "Attack Time":["2024-10-08T14:46:49.769Z"],
   "Code Processed":["0x007ffcac8a8600 MOV RAX, RSP"],
   "Command Line":["C:/Users/shaned/Desktop/winwword.exe"]
}

Accessing the first array element of Account Id:
<attrKeyMap attr="tenantId" key="&quot;Account Id&quot;[0]"/> -- get first array string element of account id and store in tenantId variable
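The string-versus-array distinction is easy to confirm with any JSON library. The sketch below uses Python's `json` module, not FortiSIEM, on a single field from the sample log. It also shows that decoding the string value a second time recovers the array, which is one general way to handle this kind of double-encoded data outside the parser.

```python
import json

# Vendor format: the value is a string whose text happens to look like an array.
escaped = r'{"Account Id":"[\"0cebd16e-eba3-40a1-a2b6-88e9e20787d3\"]"}'
# Preferred format: the value is a real JSON array.
proper = '{"Account Id":["0cebd16e-eba3-40a1-a2b6-88e9e20787d3"]}'

escaped_value = json.loads(escaped)["Account Id"]
proper_value = json.loads(proper)["Account Id"]

print(type(escaped_value).__name__)  # str
print(type(proper_value).__name__)   # list

# Subscripting the string yields a character, not the account id.
first_char = escaped_value[0]

# Decoding the string a second time recovers the array.
recovered = json.loads(escaped_value)[0]
print(recovered == proper_value[0])  # True
```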
 
 
Thanks,
 
Rob
 
Rob_SIEM

One additional note here: the use of the escaped double quotes "&quot;" in my sample parser is actually optional.

 

<attrKeyMap attr="tenantId" key="&quot;Account Id&quot;"/>

and

<attrKeyMap attr="tenantId" key="Account Id"/>

 

are treated the same if we are not using dot notation or subscript operator.

 

If the key has whitespace or a literal '.' dot in the key name, you need to escape the key.

Example:

{ "my.key.name" : "my value" }

Needs

<attrKeyMap attr="_tempVar" key="&quot;my.key.name&quot;"/>

Example 2

{ "My Key Name" : [ "obj1", "obj2", "obj3" ] }

Needs this to access the second element

<attrKeyMap attr="_obj2TempVar" key="&quot;My Key Name&quot;[1]"/>
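To make the reason for the quoting concrete, here is a toy key-path resolver in Python. It is a hypothetical illustration of how a path syntax with dot notation and subscripts can tell a literal key apart from a nested lookup. It is not FortiSIEM's implementation, and the function name `resolve` is invented for the sketch.

```python
import re

# One path token: a quoted key, a bare key, or a [n] subscript.
_TOKEN = re.compile(r'"([^"]+)"|([^.\[\]"]+)|\[(\d+)\]')


def resolve(doc, path):
    """Walk doc along path; quoted keys are taken literally, dots separate bare keys."""
    current = doc
    for quoted, bare, index in _TOKEN.findall(path):
        if index:
            current = current[int(index)]
        else:
            current = current[quoted or bare]
    return current


dotted = {"my.key.name": "my value"}
listed = {"My Key Name": ["obj1", "obj2", "obj3"]}

# Quoted: the whole string is one key.
quoted_result = resolve(dotted, '"my.key.name"')

# Unquoted: read as my -> key -> name, which does not exist in this document.
try:
    resolve(dotted, 'my.key.name')
    unquoted_found = True
except KeyError:
    unquoted_found = False

# Quoted key followed by a subscript selects the second element.
second = resolve(listed, '"My Key Name"[1]')

print(quoted_result, unquoted_found, second)  # my value False obj2
```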

KarlH

Hi Rob, and thank you! Here is the full log, while I digest all that you have written :)

Do I sense that I may have to ask the vendor to try and clean up their raw JSON log? Based on the comments, it sounds like the raw log is not optimal.

 

Also I am having problems just validating small segments

I started with only adding this chunk of code but could not get it to validate

<eventFormatRecognizer>

<![CDATA[<:gPatYear>-<:gPatMon>-<:gPatDay>\s+<:gPatTime>\s+<:gPatHostName>\s+<:gPatWord>\s+Vendor_ATTACK]]>

</eventFormatRecognizer>

[Attached screenshot: parse error.png]

The error states: "Only one parsingInstructions tag is allowed."

 

2024-10-08 14:46:58 VENDOR-EPTP INFO VENDOR_ATTACK {"Account Id":"[\"0cebd16e-eba3-40a1-a2b6-88e9e20787d3\"]","Attack Module":"[\"kernel32.dll\"]","Attack Time":"[\"2024-10-08T14:46:49.769Z\"]","Code Processed":"[\"0x007ffcac8a8600 MOV RAX, RSP\"]","Command Line":"[\"C:/Users/shaned/Desktop/winwword.exe\"]","Computer Name":"[\"GWLT011-8884\"]","File Hash":"[\"\"]","File Name":"[\"\"]","Last Module Loaded":"[\"0x00007FFC99430000 | 0x00007FFC99462000 | 0x32000 | 0x20 | C:/WINDOWS/SYSTEM32/dbgcore.DLL (FileDescription:Windows Core Debugging Helpers;ProductName:Microsoft® Windows® Operating System;VersionInfo:10.0.22621.1 (WinBuild.160101.0800);Timestamp:Sun Mar 9 22:12:15 1980;ASLR:Enabled)\"]","Last Stack FunctionCall":"[\"kernel32.dll| 0x0000000000068600 ( WinExec) | 0x00007FFCAC840000\"]","Logged In UserName":"[\"GWNT/shane.dickson\"]","Message":"[\"VENDOR prevented a threat on application winwword\"]","VENDOR Version":"[\"8.3.3\"]","Parent Process Command Line":"[\"C:/Windows/explorer.exe\"]","Parent Signature":"[\"359179ffb630953ee79523866a0a2246a5612d726c2eace52f7413f15530715e\"]","Process Signature":"[\"d3d97b6af2457c9a8c43cb3856ff227802dc928836638d8e71258c9167379168\"]","Protector IP":"[\"10.1.11.35\"]","Tenant Id":"[\"69068d46-d11d-4e81-af93-0b91bec183ef\"]","Threat Description":"[\"VENDOR Total Evasion Framework is a tool that contains several Pen-testing attack techniques that bypass all AV and EDR solutions.\"]","Threat Module":"[\"Shellcode\"]","Threat Name":"[\"VENDOR Total Evasion Framework\"]","Threat Severity":"[\"%!s(int=5)\"]","Threat Sub-Classification":"[\"Attack-Simulator\"]"}

 

Karl Henning, Security Engineer, CISSP