FortiSIEM Discussions
thiago_inorpel
New Contributor II

Shared store Workers problem

I'm having problems that affect the normal operation of the Workers. Some evidence is below. The first is related to the Shared Store: "Readers phRuleWorker pos 99.9969% more than 15% behind of Writer", and the second to the Event Pipeline: "Worker Upload Queue greater than 75MB". I believe the latter is a consequence of the former.

image.png

When analyzing the health of the Worker VM, the backend errors are highlighted in the image below. Note the large counts for PH_PARSER_FILE_STAT_FAILURE, PH_EVT_HANDLER_ERR and PH_HTTP_CLIENT_CURL_ERROR, reported by the phParser, phEventHandler and phRuleWorker processes respectively.


image.png

Checking the phoenix.log file for the three error types described above, we have:
image.png


image.png

image.png
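Since the screenshots do not carry over as text, the same check can be reproduced with a small script. This is only a minimal sketch: the three error IDs come from the post, but the log location in `LOG_PATH` is an assumption and should be adjusted to wherever phoenix.log lives on the Worker. The function names are made up for illustration.

```python
from collections import Counter
from pathlib import Path

# Error IDs named in the post.
ERROR_IDS = (
    "PH_PARSER_FILE_STAT_FAILURE",
    "PH_EVT_HANDLER_ERR",
    "PH_HTTP_CLIENT_CURL_ERROR",
)

# Assumption: adjust to the actual location of phoenix.log on your Worker.
LOG_PATH = "/opt/phoenix/log/phoenix.log"


def count_errors(lines, error_ids=ERROR_IDS):
    """Count how many lines mention each error ID (plain substring match)."""
    counts = Counter({error_id: 0 for error_id in error_ids})
    for line in lines:
        for error_id in error_ids:
            if error_id in line:
                counts[error_id] += 1
    return counts


def count_errors_in_file(path, error_ids=ERROR_IDS):
    """Stream the log file line by line so large logs do not fill memory."""
    with open(path, errors="replace") as handle:
        return count_errors(handle, error_ids)


if __name__ == "__main__":
    if Path(LOG_PATH).is_file():
        for error_id, count in count_errors_in_file(LOG_PATH).items():
            print(f"{error_id}: {count}")
    else:
        print(f"{LOG_PATH} not found; set LOG_PATH to your phoenix.log")
```

Running it on each Worker and comparing the counts over time shows which of the three errors is growing fastest.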

We currently use ClickHouse in our cluster, and we have 2 Workers. Both Workers have this Shared Store issue.

1 REPLY
thiago_inorpel
New Contributor II

I noticed that many event upload files are being generated in /opt/phoenix/cache/parser/upload/evt on Worker 1, so I changed the following phoenix_config parameters:
[PARSER]
max_num_event_files=20000 (changed to 50000)
[BEGIN phEventPackager]
max_num_event_files=10000 (changed to 50000)
I did this to raise the limit of event files stored before new events are discarded. I kept observing, but the problem persists.
As a test, I manually removed some of the files in this directory and noticed that the upload buffer quickly fills up again because of the volume of logs we receive. The result is EPS = 0 and, consequently, an Upload Buffer that is always full with a long queue.

image.png
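To put a number on how quickly the buffer refills, the file count in the upload directory can be sampled twice and turned into a growth rate. The sketch below is an illustration only: the directory and the 50000 limit come from the post, while the helper names, the 10-second interval and the assumption that event files sit directly in that directory (not in subdirectories) are mine.

```python
import os
import time

# Directory and limit taken from the post.
UPLOAD_DIR = "/opt/phoenix/cache/parser/upload/evt"
MAX_NUM_EVENT_FILES = 50000


def count_files(directory):
    """Count regular files directly inside directory (non-recursive)."""
    with os.scandir(directory) as entries:
        return sum(1 for entry in entries if entry.is_file(follow_symlinks=False))


def growth_per_second(first_count, second_count, interval_seconds):
    """Net change in file count per second; negative means the queue drains."""
    return (second_count - first_count) / interval_seconds


def seconds_until_limit(current, limit, rate):
    """Estimated seconds until the limit is hit, or None if not growing."""
    if rate <= 0:
        return None
    return max(limit - current, 0) / rate


if __name__ == "__main__" and os.path.isdir(UPLOAD_DIR):
    interval = 10  # seconds between samples (arbitrary choice)
    before = count_files(UPLOAD_DIR)
    time.sleep(interval)
    after = count_files(UPLOAD_DIR)
    rate = growth_per_second(before, after, interval)
    print(f"files: {after}, net growth: {rate:.1f} files/s")
    eta = seconds_until_limit(after, MAX_NUM_EVENT_FILES, rate)
    if eta is not None:
        print(f"limit of {MAX_NUM_EVENT_FILES} reached in about {eta:.0f} s")
```

If the net growth stays positive, raising max_num_event_files only delays the point at which events are discarded; it does not address why the files are not being uploaded.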


Are there any other parameters in the Worker configuration file that can be adjusted to make this situation more balanced?