FortiSIEM Discussions
thiago_inorpel
New Contributor II

Shared store Workers problem

I'm having some problems that are affecting the normal operation of the Workers. The first is a Shared Store alert: "Readers phRuleWorker pos 99.9969% more than 15% behind of Writer". The second is an Event Pipeline alert: "Worker Upload Queue greater than 75MB". I believe the second is a consequence of the first.

image.png

When analyzing the health of the Worker VM, the backend errors are highlighted in the image below. There are large counts for PH_PARSER_FILE_STAT_FAILURE, PH_EVT_HANDLER_ERR and PH_HTTP_CLIENT_CURL_ERROR, reported by the phParser, phEventHandler and phRuleWorker processes, respectively.

image.png

Performing a check on the phoenix.log file for the three types of errors described above, we have:
image.png


image.png

image.png
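The phoenix.log check described above can be approximated with a short script. This is only a sketch: the function names are hypothetical, the log path is left to the caller because the post does not give it, and it simply counts how often each of the three error codes appears as a whole token.

```python
import re

# The three error codes named in the post.
ERROR_CODES = (
    "PH_PARSER_FILE_STAT_FAILURE",
    "PH_EVT_HANDLER_ERR",
    "PH_HTTP_CLIENT_CURL_ERROR",
)


def count_error_codes(lines, codes=ERROR_CODES):
    """Count whole-token occurrences of each code in an iterable of log lines."""
    # \b keeps PH_EVT_HANDLER_ERR from matching inside a longer identifier.
    pattern = re.compile(r"\b(?:" + "|".join(re.escape(c) for c in codes) + r")\b")
    counts = {code: 0 for code in codes}
    for line in lines:
        for match in pattern.findall(line):
            counts[match] += 1
    return counts


def count_error_codes_in_file(path, codes=ERROR_CODES):
    """Same count, reading the log file at `path` (path supplied by the caller)."""
    with open(path, errors="replace") as handle:
        return count_error_codes(handle, codes)
```

Calling `count_error_codes_in_file` with the location of phoenix.log on a Worker returns a dictionary keyed by error code, which makes it easy to compare the two Workers or to repeat the check after a configuration change.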

We currently use ClickHouse in our cluster and have two Workers. Both Workers are showing this Shared Store issue.

1 Solution
thiago_inorpel
New Contributor II

We identified that the problem was mainly related to the cluster resources as a whole, as well as to the default values of the event processing parameters, Worker balancing and the number of files allowed in the queue.
After making the necessary changes and adapting FortiSIEM to our actual EPS, the problem was solved.


2 REPLIES
thiago_inorpel
New Contributor II

I noticed that many event upload files are being generated in /opt/phoenix/cache/parser/upload/evt on Worker 1. I changed the following phoenix_config parameters:
[PARSER]
max_num_event_files=20000 (changed to 50000)
[BEGIN phEventPackager]
max_num_event_files=10000 (changed to 50000)
The goal was to raise the limit on event files stored before new events are discarded. I kept observing, but the problem persists.
As a test, I manually removed some of the files from this directory and noticed that the upload buffer fills up again quickly because of the volume of logs we receive. The result is EPS = 0 and an Upload Buffer that is always full, with a long queue.

image.png
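To watch how quickly the upload directory refills, a small script along these lines may help. It is a rough sketch, not a FortiSIEM tool: the helper names are hypothetical, the directory and the limit of 50000 are taken from this post, and it assumes the event files sit directly in that directory rather than in subdirectories.

```python
import os

# Directory and limit as quoted in this post; adjust to your own deployment.
UPLOAD_DIR = "/opt/phoenix/cache/parser/upload/evt"
MAX_NUM_EVENT_FILES = 50000


def count_files(directory):
    """Count regular files directly inside `directory` (no recursion)."""
    with os.scandir(directory) as entries:
        return sum(1 for entry in entries if entry.is_file(follow_symlinks=False))


def queue_usage_percent(file_count, limit=MAX_NUM_EVENT_FILES):
    """Share of the configured file limit currently in use, in percent."""
    return 100.0 * file_count / limit


def growth_per_second(first_count, second_count, interval_seconds):
    """Net change in queued files per second between two samples."""
    return (second_count - first_count) / interval_seconds
```

Sampling `count_files(UPLOAD_DIR)` twice, a minute apart, and passing both counts to `growth_per_second` gives a rough idea of whether the Worker is draining the queue (negative value) or falling further behind (positive value).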


Are there any other parameters in the Worker configuration file that can be adjusted to make this situation more balanced?
