FortiSIEM
FortiSIEM provides Security Information and Event Management (SIEM) and User and Entity Behavior Analytics (UEBA)
mbenvenuti
Staff
Article Id 372400
Description This article describes how to do advanced checks on the CMDB replication.
Scope FortiSIEM.
Solution

When FortiSIEM is in DR mode and the CMDB replication is shown as out of sync under the Admin -> Health -> Replication menu, the following checks and remediation actions can be performed:

 

  1. Check the replication state:

 

Make sure dr_slot1 is displayed when running the following on the primary CLI as root:


psql -U phoenix phoenixdb -c "select * from pg_replication_slots;"
slot_name | plugin | slot_type | datoid | database | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn | wal_status | safe_wal_size
-----------+--------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+---------------------+------------+---------------
dr_slot1 | | physical | | | f | t | 2346399 | | | 44/A6BBFD80 | | reserved |

 

Make sure the state column is 'streaming' (sync_state is expected to be 'async' in DR mode):


psql -U phoenix phoenixdb -c "select * from pg_stat_replication;"
pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | backend_xmin | state | sent_lsn | write_lsn | flush_lsn | replay_lsn | write_lag | flush_lag | replay_lag | sync_priority | sync_state | reply_time
---------+----------+---------+------------------+-------------+-----------------+-------------+-------------------------------+--------------+-----------+-------------+-------------+-------------+-------------+-----------------+-----------------+-----------------+---------------+------------+-------------------------------
2346399 | 16385 | phoenix | walreceiver | super2_IP | | 57012 | 2024-12-16 12:35:35.640796+00 | | streaming | 44/A6C834A0 | 44/A6C834A0 | 44/A6C834A0 | 44/A6C834A0 | 00:00:00.000428 | 00:00:00.002012 | 00:00:00.002164 | 0 | async | 2025-01-16 17:06:19.600996+00

 

If the row is missing, or the state is not 'streaming', replication is not running properly.
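The check above can be scripted. Below is a minimal sketch: check_state is a hypothetical helper (not a FortiSIEM tool) that classifies the state value returned by pg_stat_replication.

```shell
# Hypothetical helper: classify the pg_stat_replication 'state' value.
check_state() {
  if [ "$1" = "streaming" ]; then
    echo "replication OK"
  else
    echo "replication NOT streaming (state='${1:-none}')"
  fi
}

# On the primary, as root (same psql invocation as above):
# check_state "$(psql -U phoenix phoenixdb -t -A -c 'select state from pg_stat_replication;')"
```

An empty result (no row at all) is reported as state='none', matching the "row is missing" case described above.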

 

  2. Check the connection: As mentioned in the article Technical Tip: How to setup Disaster Recovery mode with verification steps, make sure that the connection is stable, the bandwidth is sufficient, and port 5432/tcp is open on both sides.

 

From super 1 CLI as root:

 

nmap -p 5432 super2_address

PORT STATE SERVICE
5432/tcp open postgresql

 

From super 2 CLI as root:

 

nmap -p 5432 super1_address

PORT STATE SERVICE
5432/tcp open postgresql

 

Once the connection is fixed, replication should resume by itself. If /cmdb is growing and the secondary has been disconnected for a long time, it is recommended to stop the DR mode by removing the secondary node under Admin -> License -> Node, before disk usage reaches 100% and crashes the primary.
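To keep an eye on /cmdb growth while the secondary is disconnected, a simple disk-usage check can be used. This is a sketch only; the 90% threshold is an arbitrary example value, not a FortiSIEM default.

```shell
# Hypothetical helper: print the usage percentage of the filesystem
# holding the given path (POSIX df output, column 5 is Capacity).
usage_pct() {
  df -P "$1" | awk 'NR==2 { gsub(/%/, "", $5); print $5 }'
}

# Example on the primary:
# pct=$(usage_pct /cmdb)
# [ "$pct" -ge 90 ] && echo "WARNING: /cmdb at ${pct}% - remove the DR node before it fills up"
```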

 

  3. Rebuild the replication from scratch: If replication has been stopped for a long time, or a snapshot has been applied on one of the machines, and removing the DR mode from the GUI is not possible or practical, it is still possible to recreate the replication from scratch from the secondary super CLI as the root user:

 

su admin -c "monctl stop"

su admin -c "phtools --stop ALL"

systemctl stop postgresql-13

# If necessary, perform a backup of the data directory <----- Check disk usage for /cmdb and space left in destination directory.

df -h /cmdb

df -h /tmp

tar -czvf /tmp/data_backup.tar.gz /cmdb/data

cd /cmdb/data/

rm -rf *

 

Replace 'primary' with the IP of the primary in the next command.

 

su postgres -c "pg_basebackup -h primary -D /var/lib/pgsql/13/data -U phoenix -v -P -R -X stream -c fast"

systemctl start postgresql-13

systemctl status postgresql-13

su admin -c "monctl start"

su admin -c "phtools --start ALL"

 

The pg_basebackup command can take some time, depending on the amount of data to transfer.
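A rough transfer-time estimate can help plan the maintenance window. The sketch below assumes a known sustained throughput between the two supervisors (measured with a tool such as iperf3); both numbers in the example are illustrative, not measured values.

```shell
# Back-of-the-envelope transfer time for pg_basebackup.
# args: data size in GB, sustained throughput in MB/s -> estimated minutes
eta_minutes() {
  awk -v gb="$1" -v mbps="$2" 'BEGIN { printf "%.0f\n", gb * 1024 / mbps / 60 }'
}

eta_minutes 50 20   # e.g. ~50 GB of /cmdb data at ~20 MB/s
```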

 

Afterwards, repeat step 1 to check the replication state.