FortiAnalyzer
FortiAnalyzer can receive logs and Windows host events directly from endpoints connected to EMS, and you can use FortiAnalyzer to analyze the logs and run reports.
bboudjema
Staff
Staff
Article Id 335817
Description

The article describes how to create a dataset using regular expressions based on syslog logs to make a report.

Scope FortiAnalyzer.
Solution

Glossary and terminology:

 

Regular expression (REGEX): Regular expressions (REGEX) are patterns used to match and manipulate text.

 

Prerequisites:

 

Explanations of the syslog log format:

 

A Syslog message typically consists of three main parts:

 

  1. PRI (Priority): Encoded as <n>, where n = (Facility * 8) + Severity. Example: <34> indicates a facility of 4 (Auth) and severity of 2 (Critical).
  2. HEADER: Contains the timestamp and hostname. Example: Aug 22 10:00:00 myhost.
  3. MSG (Message): Includes the actual log content. Example: sshd[12345]: Failed password for invalid user admin.

 

The MSG portion is the interesting part, where we need to extract the desired value from the text to construct the relevant SQL query for the dataset.

 

For this KB article, an example is shown. For the needs, it is necessary to adapt the SQL request (PostgreSQL).

 

Here is a list of important metacharacters to know when building a regular expression in PostgreSQL:

 

  1. .: Matches any single character except a newline.
  2. ^: Matches the start of a string.
  3. $: Matches the end of a string.
  4. *: Matches 0 or more occurrences of the preceding element.
  5. +: Matches 1 or more occurrences of the preceding element.
  6. ?: Matches 0 or 1 occurrence of the preceding element (makes it optional).
  7. []: Matches any one character within the brackets (e.g., [abc] matches 'a', 'b', or 'c').
  8. [^]: Matches any one character not within the brackets (e.g., [^abc] matches any character except 'a', 'b', or 'c').
  9. (): Groups patterns together and captures the matched text.
  10. |: Acts as a logical OR between patterns (e.g., a|b matches 'a' or 'b').
  11. \: Escapes a metacharacter to treat it as a literal character (example: \. matches a literal period).
  12. \d: Matches any digit (equivalent to [0-9]).
  13. \D: Matches any non-digit character.
  14. \w: Matches any word character (letters, digits, or underscore, equivalent to [a-zA-Z0-9_]).
  15. \W: Matches any non-word character.
  16. \s: Matches any whitespace character (spaces, tabs, etc.).
  17. \S: Matches any non-whitespace character.
  18. {}: Specifies the number of occurrences for the preceding element (e.g., a{3} matches 'aaa').

 

These metacharacters help you define precise patterns to search, extract, and manipulate text in PostgreSQL using regular expressions.

 

Step 1 – View and analyze the syslog log format.

 

Below is an example of a syslog log stored on the FortiAnalyzer database: (Go to syslog ADOM -> logView -> syslog).

 

1 2024-07-11T13:55:28+04:00 172.22.8.98 PulseSecure: - - - 2024-07-11 13:55:28 - ER-AUH-ISA10 - [1.5.95.123] Default Network::user.name@domain.com (SAM-Realm)[][] - Login succeeded for user.name@domain.com /SAM-Realm from xx.xx.xx.xx with Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36 Edg/126.0.0.0.

 

It can be asked to create a report table like the below one based on syslog logs.

 

bboudjema_0-1724406489297.png

 

In PostgreSQL, the REGEXP_MATCHES() function is used by FortiAnalyzer to search a string for a pattern specified by a regular expression and return the matches. The function returns an array of text, where each element is a substring that matches a parenthesized subexpression within the regular expression.

 

REGEXP_MATCHES (source_string, pattern, [flags]).

 

source_string: The string to be searched.

pattern: The regular expression pattern to match.

flags: (Optional) Modifiers that change the behavior of the regular expression. Common flags include:

  • 'i' Case-insensitive matching.
  • 'g': Global search (find all matches).

 

Step 2 - Create the SQL dataset query using the REGEX function.

 

Here is an example of a PostgreSQL dataset query that uses regular expressions: (Go to Reports -> Report Definitions -> Datasets, and select Create New).

 

SELECT

REGEXP_MATCHES(`msg`, '([\w]+-[\w]+-[\w]+) - \[[^\]]+\] Default Network::[^@]+@[^()]+') AS "Login Device",

REGEXP_MATCHES(`msg`, 'Default Network::(.*)\(.*\)\[') AS "Login User",

REGEXP_MATCHES(`msg`, '\[([\d\.]+)]') AS "Public IP",

REGEXP_MATCHES(`msg`, '([0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2})') AS "Login Time",

CASE

WHEN `msg` ~* 'Login succeeded' THEN 'Login succeeded'

ELSE 'Login failed'

END AS "Login Status"

FROM

$log

WHERE

$filter

GROUP BY

msg

ORDER BY

"Login Device"

 

The below regular expression is used to extract the 'Login User' from the msg text:

 

REGEXP_MATCHES(`msg`, 'Default Network::(.*)\(.*\)\[') AS "Login User",

 

How it works:

 

  • 'Default Network::' Look for this exact phrase in the text.
  • (.*): Captures anything that comes after 'Default Network::' until the first (.
  • \(.*\)\[: Ensures it matches a part of the text that has parentheses () and a square bracket [ after the username.

 

Purpose: It finds the user’s name in a message like 'Default Network::username (other details)[more info]' and extracts 'username' as the 'Login User'.

 

 

Notes:

  • If no matches are found, REGEXP_MATCHES() returns an empty set.
  • The function is usually used in the SELECT list or in a WHERE clause to filter rows based on pattern matching.

 

If any columns of the table are empty, it indicates that the regex is incorrect and needs to be adjusted. For example:

 

bboudjema_1-1724406588809.png

 

Related documents:

Creating datasets

SQL Query Documentation FortiAnalyzer 7.4.0

9.7. Pattern Matching #