FortiSIEM Discussions
KarlH
Contributor

Seeking code review of parser XML that is failing testing (FortiSIEM 7.1)

Hello, the parser XML below is failing testing on 7.1. Any help is much appreciated.

<eventFormatRecognizer>
<![CDATA[.*Vendor _ATTACK\s+]]>
</eventFormatRecognizer>
<parsingInstructions>
<collectFieldsByRegex src="$_rawmsg">
<regex><![CDATA[.*Vendor Name {<_body:gPatMesgBody>}]]></regex>
</collectFieldsByRegex>
<setEventAttribute attr="eventType">VendorAlert</setEventAttribute>
<collectAndSetAttrByJSON src="$_body">
<attrKeyMap attr="accountid" key="Account Id"/>
<attrKeyMap attr="VendorAttackModule" key="Attack Module"/>
</collectAndSetAttrByJSON>
</parsingInstructions>
Karl Henning, Security Engineer, CISSP
Rob_SIEM

Hi Karl, the error messages are not always helpful. I'll see if I can request some improvements in this area so they better tell you what the error is.

 

At a minimum the parser needs these blocks:

1) eventFormatRecognizer -- A small regex snippet used to determine whether this parser should be used to parse the entire log. Typically it matches the unique log header combination of this vendor's log. When an event comes in, the parsers are evaluated in order, and each eventFormatRecognizer regex is matched against the log. The first match tells the parser process to use that parser for the log. The pattern here is not designed to parse anything; it is just a regex match. We make use of predefined regex patterns in FortiSIEM to make matching easy. For example, the pattern gPatTime is the regex that matches a time such as 00:00:00.

 

2) parsingInstructions -- Once the parser is matched, the instructions here are used to actually parse the log. Start by separating the header values from the body of the log, then parse the body. You can clone any other parser to use as a reference for the functions used and the parser flow. Some are simple, some are complex.
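
The selection step described in 1) can be sketched outside FortiSIEM. This is only a Python illustration of "first matching recognizer wins"; the parser names and recognizer regexes below are hypothetical, and FortiSIEM's real matching engine may behave differently.

```python
import re

# Hypothetical parsers in evaluation order: (name, recognizer regex).
PARSERS = [
    ("ExampleFirewallParser", r"\sEXAMPLE_FW\s"),
    ("VendorAttackParser", r"\sVendor_ATTACK\s"),
    ("CatchAllSyslogParser", r"^\d{4}-\d{2}-\d{2}\s"),
]


def select_parser(raw_log):
    """Return the name of the first parser whose recognizer matches, else None."""
    for name, recognizer in PARSERS:
        # The recognizer only decides whether a parser applies; it extracts nothing.
        if re.search(recognizer, raw_log):
            return name
    return None


sample = '2024-10-08 14:46:58 host1 INFO Vendor_ATTACK {"Account Id":"1"}'
print(select_parser(sample))
```

Because evaluation stops at the first hit, a broad recognizer placed early in the list would take the log before a more specific parser further down gets a chance.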

 

We do have an NSE Parser Training course for FortiSIEM, which I'd highly recommend as well. It goes over log structure types and the methodologies used to parse them.

 

Several constructs nested inside the regex can be confusing the first time you see them.

<_someVar:gPatInt> -- these blocks use a global regex pattern (gPatInt is just \d+) and store the match in the temporary variable _someVar.

Conversely, <srcIpAddr:gPatIpAddr> maps the text matched by the IP address regex to the Source IP Address event attribute.

 

<eventFormatRecognizer><![CDATA[<:gPatYear>-<:gPatMon>-<:gPatDay>\s+<:gPatTime>\s+<:gPatHostName>\s+<:gPatWord>\s+Vendor_ATTACK]]></eventFormatRecognizer>

<parsingInstructions>

   <setEventAttribute attr="eventType">Vendor_Attack_Generic</setEventAttribute>

</parsingInstructions>
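
To make the <name:pattern> notation concrete, here is a rough Python sketch that expands such tokens into ordinary regex groups. It is not how FortiSIEM implements this. Only gPatInt (\d+) and the 00:00:00 shape of gPatTime come from this thread; every other pattern definition below is a guess made for illustration.

```python
import re

# Assumed stand-ins for FortiSIEM global patterns. Only gPatInt and the shape
# of gPatTime are taken from the thread; the rest are guesses.
GLOBAL_PATTERNS = {
    "gPatInt": r"\d+",
    "gPatTime": r"\d{1,2}:\d{1,2}:\d{1,2}",
    "gPatYear": r"\d{4}",
    "gPatMon": r"\d{1,2}",
    "gPatDay": r"\d{1,2}",
    "gPatHostName": r"[\w.-]+",
    "gPatWord": r"\w+",
}

TOKEN = re.compile(r"<(\w*):(\w+)>")


def expand(pattern):
    """Turn <name:gPatX> into a named group and <:gPatX> into a plain group."""
    def repl(match):
        name = match.group(1)
        regex = GLOBAL_PATTERNS[match.group(2)]
        return f"(?P<{name}>{regex})" if name else f"(?:{regex})"
    return TOKEN.sub(repl, pattern)


recognizer = expand(
    r"<:gPatYear>-<:gPatMon>-<:gPatDay>\s+<:gPatTime>\s+"
    r"<:gPatHostName>\s+<:gPatWord>\s+Vendor_ATTACK"
)
line = "2024-10-08 14:46:58 host-1 INFO Vendor_ATTACK {}"
print(bool(re.search(recognizer, line)))

fields = re.search(
    expand(r"<_count:gPatInt> events at <_time:gPatTime>"),
    "12 events at 14:46:58",
)
print(fields.groupdict())
```

Tokens with an empty name only have to match, which is all a recognizer needs. Named tokens capture a value, either into a temporary variable (leading underscore) or into an event attribute.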

Rob_SIEM

One additional question. What is the product and log source of these logs? How are these getting shipped into FortiSIEM? 

KarlH

Hi Rob,

I have:

<eventFormatRecognizer>
<![CDATA[ MORPHISEC_ATTACK]]>
</eventFormatRecognizer>
<parsingInstructions>
<collectFieldsByRegex src="$_rawmsg">
<regex><![CDATA[.*MORPHISEC_ATTACK\s+ {<_body:gPatMesgBody>}]]></regex>
</collectFieldsByRegex>
<setEventAttribute attr="eventType">MorphisecAlert</setEventAttribute>
<collectAndSetAttrByJSON src="$_body">
<attrKeyMap attr="accountid" key='"Account Id"'/>
<attrKeyMap attr="morphisecAttackModule" key='"Attack Module"'/>
</collectAndSetAttrByJSON>
</parsingInstructions>

Do I actually need to enable the parser during testing? There is no other parser that would capture this string. I think developing this in baby steps is the best approach. The raw log is shown below.
2024-10-08 14:46:58 MORPHISEC-EPTP INFO MORPHISEC_ATTACK {"Account Id":"[\"0cebd16e-eba3-40a1-a2b6-88e9e20787d3\"]","Attack Module":"[\"kernel32.dll\"]","Attack Time":"[\"2024-10-08T14:46:49.769Z\"]","Code Processed":"[\"0x007ffcac8a8600 MOV RAX, RSP\"]","Command Line":"[\"C:/Users/shaned/Desktop/winwword.exe\"]","Computer Name":"[\"GWLT011-8884\"]","File Hash":"[\"\"]","File Name":"[\"\"]","Last Module Loaded":"[\"0x00007FFC99430000 | 0x00007FFC99462000 | 0x32000 | 0x20 | C:/WINDOWS/SYSTEM32/dbgcore.DLL (FileDescription:Windows Core Debugging Helpers;ProductName:Microsoft® Windows® Operating System;VersionInfo:10.0.22621.1 (WinBuild.160101.0800);Timestamp:Sun Mar 9 22:12:15 1980;ASLR:Enabled)\"]","Last Stack FunctionCall":"[\"kernel32.dll| 0x0000000000068600 ( WinExec) | 0x00007FFCAC840000\"]","Logged In UserName":"[\"GWNT/shane.dickson\"]","Message":"[\"MORPHISEC prevented a threat on application winwword\"]","MORPHISEC Version":"[\"8.3.3\"]","Parent Process Command Line":"[\"C:/Windows/explorer.exe\"]","Parent Signature":"[\"359179ffb630953ee79523866a0a2246a5612d726c2eace52f7413f15530715e\"]","Process Signature":"[\"d3d97b6af2457c9a8c43cb3856ff227802dc928836638d8e71258c9167379168\"]","Protector IP":"[\"10.1.11.35\"]","Tenant Id":"[\"69068d46-d11d-4e81-af93-0b91bec183ef\"]","Threat Description":"[\"MORPHISEC Total Evasion Framework is a tool that contains several Pen-testing attack techniques that bypass all AV and EDR solutions.\"]","Threat Module":"[\"Shellcode\"]","Threat Name":"[\"MORPHISEC Total Evasion Framework\"]","Threat Severity":"[\"%!s(int=5)\"]","Threat Sub-Classification":"[\"Attack-Simulator\"]"}

I cannot see why the eventFormatRecognizer would not match that string.

Karl Henning, Security Engineer, CISSP
Rob_SIEM

Correct, the flow is validate -> test -> enable -> Save 

Then on the parser screen, click the "apply" button to push the parser change out to all collectors to take effect. New events coming in should now match the parser you applied. 

 

When you are testing the parser, those changes don't take effect for incoming events until they are saved and applied.

 

Thanks,

KarlH

No, I just meant for testing; it does not have to be enabled. I did not know about the sequence validate -> test -> enable -> Save. I thought I had to save my changes first and then test. For testing, it sounds like it still traverses all the other parsers if it can't work with mine, and does not just stop at my regex. Did you see the actual raw log I posted? I don't think the vendor is going to do any cleanup of the log.

<![CDATA[ Morphisec-EPTP INFO MORPHISEC_ATTACKK]]>
Even when I place the whole actual string Morphisec-EPTP INFO MORPHISEC_ATTACK, like so:
<eventFormatRecognizer>
<![CDATA[ Morphisec-EPTP INFO MORPHISEC_ATTACK ]]>
</eventFormatRecognizer>
<parsingInstructions>
<collectFieldsByRegex src="$_rawmsg">
<regex><![CDATA[ Morphisec-EPTP INFO MORPHISEC_ATTACK {<_body:gPatMesgBody>}]]></regex>
</collectFieldsByRegex>
<setEventAttribute attr="eventType">MorphisecAlert</setEventAttribute>
<collectAndSetAttrByJSON src="$_body">
<attrKeyMap attr="accountid" key='"Account Id"'/>
<attrKeyMap attr="morphisecAttackModule" key='"Attack Module"'/>
</collectAndSetAttrByJSON>
</parsingInstructions>

It still does not find the parser.
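
One way to sanity-check a recognizer outside FortiSIEM is to run the same literal strings against the log header in Python. This sketch assumes the recognizer behaves like an unanchored, case-sensitive regex search, which is an assumption rather than documented FortiSIEM behaviour. The log header is copied from the raw log posted earlier, with the JSON body shortened to {}. If the vendor tag in that sample was altered before posting, the case difference shown here may not exist in the real log.

```python
import re

# Header of the raw log posted earlier in the thread (JSON body shortened to {}).
raw_log = "2024-10-08 14:46:58 MORPHISEC-EPTP INFO MORPHISEC_ATTACK {}"

candidates = [
    " Morphisec-EPTP INFO MORPHISEC_ATTACK ",  # mixed-case vendor tag
    " MORPHISEC-EPTP INFO MORPHISEC_ATTACK ",  # same case as the raw log
    " MORPHISEC_ATTACK",                        # shortest form tried earlier
]

# Map each candidate recognizer to whether it is found anywhere in the log.
results = {c: bool(re.search(c, raw_log)) for c in candidates}
for pattern, matched in results.items():
    print(repr(pattern), "->", matched)
```

In this Python check the mixed-case string does not match the posted log line, while the other two do. That only shows what a plain regex engine would do with these inputs. It does not confirm what the FortiSIEM test screen is actually complaining about.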

 

Karl Henning, Security Engineer, CISSP
Rob_SIEM

Btw for this, event attributes must be defined in FortiSIEM first.

<attrKeyMap attr="accountid" key='"Account Id"'/>
<attrKeyMap attr="morphisecAttackModule" key='"Attack Module"'/>

accountId is defined, but attribute names are case sensitive, so accountid will result in an error.


If you defined a custom event attribute in FortiSIEM (programmatic name "morphisecAttackModule"), you can use it. If not, you can just mark it as a temp variable, e.g. _myTempVar.

<attrKeyMap attr="accountId" key="Account Id"/>
<!-- temp vars will not show up in GUI, only usable for other logic in FSM -->
<attrKeyMap attr="_morphisecAttackModule" key="Attack Module"/>

Instead, consider using an existing attribute such as:

<attrKeyMap attr="accountId" key='"Account Id"'/>
<attrKeyMap attr="module" key='"Attack Module"'/>

 

We have the full sample log, we can make you a sample parser in a few days as time allows. 

KarlH

So in the Event Attributes tab, 'accountid' was already there in lower case; its display name is Account id.

 

And yes, I created 4 custom attributes, all of the String value type. These were fields unique to this log, like Code Processed and Attack Module.

 

Very much appreciate the assist,

 

Karl Henning, Security Engineer, CISSP
KarlH

I would welcome any code you could provide.

 

Again, Thank you. 

Karl Henning, Security Engineer, CISSP
Rob_SIEM

Hi Karl,

 

The attached example parser worked for the full sample log you provided. 

 

The important part of this parser is this section:

  <!-- because our json is not ideally formed, you have to strip away the ["my value"] to achieve my value -->
  <setEventAttribute attr="msg">replaceStringByRegex($msg, ":\s*\"\[\\", ":")</setEventAttribute>
  <setEventAttribute attr="msg">replaceStringByRegex($msg, "\\\"\]\",", "\",")</setEventAttribute>
 
Remember that each value in the JSON object is a string, even though each string actually contains a serialized JSON array. So as a quick workaround we strip off the array designation.
 
"Account Id" : "[\"12345\"]" is a string, so we strip it down to "Account Id" : "12345" as a workaround. This isn't perfect, however.
 
Ideally the format should have been:
"Account Id" : [ "account1", "account2", "account3" ]
instead of
"Account Id" : "[ \"account1\", \"account2\", \"account3\"] "
The restriction here is that we treat the entire value as one object, instead of as separate objects, if there happens to be more than one.
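
For anyone who wants to see what the two replacements do before loading the parser, here is a small Python simulation. It assumes replaceStringByRegex behaves like a global regex substitution and that the only escape to undo in the quoted arguments is \" becoming ". Both are assumptions, not verified FortiSIEM behaviour. The body is a shortened, made-up sample in the same shape as the real log.

```python
import json
import re

# Shortened body in the same shape as the sample log: each value is a string
# that itself contains a serialized one-element JSON array.
body = r'{"Account Id":"[\"12345\"]","Attack Module":"[\"kernel32.dll\"]","Threat Name":"[\"Demo\"]"}'


def strip_array_wrappers(msg):
    """Apply the same two replacements as the parser, in the same order."""
    msg = re.sub(r':\s*"\[\\', ":", msg)   # :"[\   becomes  :
    msg = re.sub(r'\\"\]",', '",', msg)    # \"]",  becomes  ",
    return msg


cleaned = strip_array_wrappers(body)
print(cleaned)

fields = json.loads(cleaned)
print(fields["Account Id"])
print(fields["Threat Name"])
```

In this simulation every value that is followed by a comma comes out clean, but the last key keeps a trailing "] because the second pattern requires a comma after the closing quote. Whether the same happens inside FortiSIEM is something to check against the last attribute in the parsed test event.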
 
Thanks
KarlH
Contributor

Hello,@R

 

When I tried to validate it in FortiSIEM 7.1.3, it complained about line 62. In which version did the parser you so kindly created validate successfully?

 

<parsingInstructions>
<collectFieldsByRegex src="$_rawmsg">
<regex><![CDATA[<_year:gPatYear>-<_mon:gPatMon>-<_day:gPatDay>\s+<_time:gPatTime>\s+(?:<reptDevIpAddr:gPatIpAddr>|<reptDevName:gPatHostName>)\s+<eventSeverityCat:gPatWord>\s+VENDOR_ATTACK\s+<msg:gPatMesgBody>]]></regex>
</collectFieldsByRegex>

<setEventAttribute attr="deviceTime">toDateTime($_mon, $_day, $_year, $_time)</setEventAttribute>

<!-- Give your event types clear names like vendor-model-eventType e.g. Microsoft-Office-LoginFailure -->
<setEventAttribute attr="eventType">Vendor_Attack_Generic</setEventAttribute>
<setEventAttribute attr="eventSeverity">1</setEventAttribute>

<!-- because our json is not ideally formed, you have to strip away the ["my value"] to achieve my value -->
<setEventAttribute attr="msg">replaceStringByRegex($msg, ":\s*\"\[\\", ":")</setEventAttribute>
<setEventAttribute attr="msg">replaceStringByRegex($msg, "\\\"\]\",", "\",")</setEventAttribute>

<collectAndSetAttrByJSON src="$msg">
<attrKeyMap attr="accountId" key="Account Id"/>
<attrKeyMap attr="module" key="Attack Module"/>
<!-- 2024-10-08T14:46:49.769Z must be converted to epoch later in parser for deviceTime -->
<attrKeyMap attr="_eventTime" key="Attack Time"/>
<attrKeyMap attr="errorCode" key="Code Processed"/>
<attrKeyMap attr="command" key="Command Line"/>
<attrKeyMap attr="hostName" key="Computer Name"/>
<attrKeyMap attr="targetHashCode" key="File Hash"/>
<attrKeyMap attr="fileName" key="File Name"/>
<attrKeyMap attr="startModule" key="Last Module Loaded"/>
<attrKeyMap attr="funName" key="Last Stack FunctionCall"/>
<attrKeyMap attr="user" key="Logged In UserName"/>
<attrKeyMap attr="details" key="Message"/>
<attrKeyMap attr="version" key="VENDOR Version"/>
<attrKeyMap attr="parentCommand" key="Parent Process Command Line"/>
<attrKeyMap attr="hashCode" key="Process Signature"/>
<attrKeyMap attr="parentFileHashCode" key="Parent Signature"/>
<attrKeyMap attr="_protectorIp" key="Protector IP"/>
<attrKeyMap attr="tenantId" key="Tenant Id"/>
<attrKeyMap attr="virusId" key="Threat Id"/>
<attrKeyMap attr="description" key="Threat Description"/>
<attrKeyMap attr="threatCategory" key="Threat Module"/>
<attrKeyMap attr="virusName" key="Threat Name"/>
<attrKeyMap attr="_severity" key="Threat Severity"/>
<attrKeyMap attr="subtype" key="Threat Sub-Classification"/>
</collectAndSetAttrByJSON>

<when test="exist _eventTime">
<switch>
<case>
<!-- 2024-10-08T14:46:49.769Z must be converted to epoch later in parser for deviceTime -->
<collectFieldsByRegex src="$_eventTime">
<regex><![CDATA[<_year:gPatYear>-<_mon:gPatMon>-<_day:gPatDay>T<_time:gPatTime>\.\d+(?:<_tz:gPatTimeZone>)?]]></regex>
</collectFieldsByRegex>

<!-- Set Date Time -->
<choose>
<when test="exist _tz">
<setEventAttribute attr="eventTime">toDateTime($_mon, $_day, $_year, $_time, $_tz)</setEventAttribute>
</when>
<otherwise>
<setEventAttribute attr="eventTime">toDateTime($_mon, $_day, $_year, $_time)</setEventAttribute>
</otherwise>
</choose>
</case>
<default/>
</switch>
</when>

<when test="exist _severity">
<switch>
<case>
<!-- %!s(int=5) -->
<collectFieldsByRegex src="$_severity">
<regex><![CDATA[\(int=<eventSeverity:patSeverity>\)]]></regex>
</collectFieldsByRegex>
</case>
<default/>
</switch>
</when>


<when test="exist user">
<switch>
<case>
<!-- DOMAIN/user -->
<collectFieldsByRegex src="$user">
<regex><![CDATA[^\s*<domain:patExceptSlash>/<user:gPatStr>$]]></regex>
</collectFieldsByRegex>
</case>
<default/>
</switch>
</when>

<when test="exist _protectorIp">
<switch>
<case>
<collectFieldsByRegex src="$_protectorIp">
<regex><![CDATA[<hostIpAddr:gPatIpAddr>]]></regex>
</collectFieldsByRegex>
</case>
<default/>
</switch>
</when>

</parsingInstructions>
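
The three post-processing steps at the end of the parser (Attack Time, Threat Severity, Logged In UserName) can be tried on the sample values in plain Python. This is only a sketch of the intent. patSeverity and patExceptSlash are local patterns whose definitions are not shown in the excerpt above, so the regexes here are stand-ins, and the trailing Z is assumed to mean UTC.

```python
import re
from datetime import datetime, timezone


def parse_attack_time(value):
    """Parse a timestamp like 2024-10-08T14:46:49.769Z into an aware datetime."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)  # assumption: Z means UTC


def parse_severity(value):
    """Extract 5 from the mangled string %!s(int=5); return None if absent."""
    match = re.search(r"\(int=(\d+)\)", value)
    return int(match.group(1)) if match else None


def split_domain_user(value):
    """Split DOMAIN/user into (domain, user); return (None, value) otherwise."""
    match = re.match(r"^\s*([^/]+)/(.+)$", value)
    return (match.group(1), match.group(2)) if match else (None, value)


print(parse_attack_time("2024-10-08T14:46:49.769Z").isoformat())
print(parse_severity("%!s(int=5)"))
print(split_domain_user("GWNT/shane.dickson"))
```

Each helper falls back to a harmless result when the input does not have the expected shape, which mirrors the empty <default/> branches in the parser.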

Karl Henning, Security Engineer, CISSP