FortiSIEM Discussions
vishal_rawat
New Contributor II

FortiSIEM 7.5.0 – Automated HA Design Across Two Sites (Supervisors, Keepers, Workers)

The customer's existing setup is as follows: Site 1: 1 Supervisor, 1 Keeper, 2 Workers (FortiSIEM 7.2.6).
We plan to upgrade this setup to 7.5.0 and then add 2 Supervisor nodes and 2 Keeper nodes.

We are planning an Automated High Availability (HA) deployment for FortiSIEM 7.5.0 for a customer with two sites. The proposed architecture includes 3 Supervisors with DB, 3 ClickHouse Keeper nodes, and 2 Worker nodes, with the following two placement options under consideration:

Option 1
  • Site 1: 2 Supervisors, 3 Keepers, 2 Workers
  • Site 2: 1 Supervisor

Option 2
  • Site 1: 2 Supervisors, 2 Keepers, 2 Workers
  • Site 2: 1 Supervisor, 1 Keeper

We would like guidance on the following:
1. Which of the above options is recommended as per FortiSIEM best practices?
2. Will either option support automated HA at the node level?
3. Will either option support automated HA at the site level?
Any recommendations or design considerations based on real-world deployments would be greatly appreciated.

vishal_rawat
New Contributor II

@Anthony_E , @Secusaurus 
Could you please help here?

Secusaurus
Contributor III

Hi @vishal_rawat,

 

You will first need to define your goals and then, depending on your specific requirements, select the product components you need to achieve them.

So, what should still work if site 1 fails?

  • If you want the full system to work, you need at least one worker plus enough surviving keeper and supervisor nodes to form a majority.
  • If you want read-only access, you need something that holds the data (a worker) and a supervisor as the frontend. A keeper is not relevant in that case.
  • If you need a backup of your data, you need a supervisor (if you want the CMDB) and a worker (if you want the event DB and do not use an archive), but you need a fast connection.*
  • If you need to scale and don't care about a site 1 failure (e.g. you are in a hyperscaler cloud and handle that with a load balancer in front), you would probably only put workers (with a new shard) in site 2.
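
To make the first bullet concrete, here is a minimal sketch that applies only the counting rule stated above (at least one worker, plus a strict majority of keepers and of supervisors) to the two placement options from the question. The dictionary layout and the function names are hypothetical and chosen for illustration; this does not query or model FortiSIEM itself, and the actual failover behaviour should be confirmed against the official documentation.

```python
# Hypothetical model: node counts per role and site, copied from the question.
OPTIONS = {
    "Option 1": {
        "supervisor": {"site1": 2, "site2": 1},
        "keeper": {"site1": 3, "site2": 0},
        "worker": {"site1": 2, "site2": 0},
    },
    "Option 2": {
        "supervisor": {"site1": 2, "site2": 1},
        "keeper": {"site1": 2, "site2": 1},
        "worker": {"site1": 2, "site2": 0},
    },
}


def has_majority(surviving: int, total: int) -> bool:
    """Strict majority: more than half of the configured nodes are still up."""
    return surviving > total // 2


def full_system_survives(layout: dict, failed_site: str) -> bool:
    """Apply the counting rule from the list above to one failed site."""

    def surviving(role: str) -> int:
        return sum(n for site, n in layout[role].items() if site != failed_site)

    def total(role: str) -> int:
        return sum(layout[role].values())

    return (
        surviving("worker") >= 1
        and has_majority(surviving("keeper"), total("keeper"))
        and has_majority(surviving("supervisor"), total("supervisor"))
    )


if __name__ == "__main__":
    for name, layout in OPTIONS.items():
        for site in ("site1", "site2"):
            ok = full_system_survives(layout, site)
            print(f"{name}, {site} fails: full system survives = {ok}")
```

Under this simplified rule, both options keep a majority when site 2 fails, and neither does when site 1 fails, because site 2 never holds more than one of the three supervisors or keepers and holds no worker.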

 

Once configured, automated HA applies to all nodes that take part in it. What exactly happens depends on how the failure occurs. If site 1 is not reachable from the customer but is still working correctly and communicating with site 2, there will be no automated "failover".

For the full design/configuration, it's better to check the docs. But in general, once configured on the supervisor, every node knows how to behave.

*The issue with FortiSIEM is that workers of the same shard need a high-speed connection. If multi-site means "internet in between", throughput and latency may affect the communication between the workers.

This means: if you cannot have a high-speed connection between the two sites, you should not distribute workers of the same shard across the sites, but use different shards on different sites. In case of an outage of site 1, you then won't have access to the data of the shard(s) placed in site 1, but all new incoming data will still be stored, analyzed and processed via shard 2 on site 2.
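
The shard trade-off can be sketched the same way. The snippet below is an illustration only: the shard names, the site names and the `shard_availability` helper are assumptions made for the example, not FortiSIEM or ClickHouse settings. It simply marks a shard as reachable if at least one of its workers sits outside the failed site.

```python
def shard_availability(shard_sites: dict[str, set[str]], failed_site: str) -> dict[str, bool]:
    """Return, per shard, whether any worker of that shard is outside the failed site."""
    return {
        shard: bool(sites - {failed_site})
        for shard, sites in shard_sites.items()
    }


if __name__ == "__main__":
    # Hypothetical layout: one shard per site, as suggested for slow inter-site links.
    per_site_shards = {"shard1": {"site1"}, "shard2": {"site2"}}
    print(shard_availability(per_site_shards, "site1"))
```

With one shard per site, a site 1 outage leaves shard1 unreachable while shard2 keeps accepting new data. A shard whose workers span both sites would stay reachable, but that layout needs the high-speed link described above.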

If you're looking for a real disaster recovery solution (site 1 burns down), you will need to use additional methods, e.g. an archive database on NFS.

 

 

Best,

Christian

FCX #003451 | Fortinet Advanced Partner