Technical Tip: Database replication fails due to stopped MySQL service
Description
This article describes how to fix an issue where a High Availability (HA) setup undergoes failover and database replication does not work because MySQL is not starting on the Secondary Server.
Scope
FortiNAC.
Solution
To verify mysql state in both Primary and Secondary Servers, type:
service mysqld status
Example of MySQL not starting:
service mysqld status
Redirecting to /bin/systemctl status mysqld.service
● mysqld.service - MySQL Community Server
Loaded: loaded (/usr/lib/systemd/system/mysqld.service; enabled; vendor preset: disabled)
Active: activating (start-post) since Tue 2019-10-15 11:06:22 CDT; 2min 57s ago
Process: 7388 ExecStart=/usr/bin/mysqld_safe (code=exited, status=0/SUCCESS)
Process: 7377 ExecStartPre=/usr/bin/mysql-systemd-start pre (code=exited, status=0/SUCCESS)
Main PID: 7388 (code=exited, status=0/SUCCESS); : 7389 (mysql-systemd-s)
CGroup: /system.slice/mysqld.service
└─control
├─7389 /bin/bash /usr/bin/mysql-systemd-start post
└─8297 sleep 1
Redirecting to /bin/systemctl status mysqld.service
● mysqld.service - MySQL Community Server
Loaded: loaded (/usr/lib/systemd/system/mysqld.service; enabled; vendor preset: disabled)
Active: activating (start-post) since Tue 2019-10-15 11:06:22 CDT; 2min 57s ago
Process: 7388 ExecStart=/usr/bin/mysqld_safe (code=exited, status=0/SUCCESS)
Process: 7377 ExecStartPre=/usr/bin/mysql-systemd-start pre (code=exited, status=0/SUCCESS)
Main PID: 7388 (code=exited, status=0/SUCCESS); : 7389 (mysql-systemd-s)
CGroup: /system.slice/mysqld.service
└─control
├─7389 /bin/bash /usr/bin/mysql-systemd-start post
└─8297 sleep 1
This article provides steps to restart MySQL and database replication.
Login to the Secondary Server CLI as root and perform the following steps:
- Kill the Management Process PID.
shutdownCampusMgr –kill
- Stop MySQL.
service mysqld stop
- In /var/lib/mysql/ directory, delete only the bsc sub-directory, ib* and mysqld-relay-bin.* files.
rm -rf /var/lib/mysql/bsc
rm /var/lib/mysql/ib*
rm /var/lib/mysql/mysqld-relay-bin.*
rm /var/lib/mysql/mysql-bin.* (FortiNAC F)
- Verify the server-id value in /etc/my-runtime.cnf is different from that on the Primary Server. If the same, modify the value (refer to comments in file for suggested values).
- On the Primary Server, stop control and management processes. Run the following:
shutdownCampusMgr
Wait ten to fifteen seconds, then run the following:
shutdownCampusMgr –kill
Wait a further ten to fifteen seconds, then run the following:
service mysqld restart
startupCampusMgr
- On the Secondary Server, restart MySQL and verify the state:
service mysqld restart
service mysqld status
MySQL is running when the Active status displays 'Active: active (running)'.
- On the Secondary Server, restart replication:
hsStartSQLReplication <Primary IP address> bsiadmin bra26153
- On the Secondary Server, start the Management Process:
startupCampusMgr
- Wait several minutes, then validate that database replication is working. Enter the following:
hsIsSlaveActive
If the response contains the line slave is active, database replication is working.
Alternatively, tail the processManager logs after starting the Management Process and look for 'slave is active':
tail -F /bsc/logs/output.processManager
Note: It can take several attempts before replication starts working, so be patient. Errors may also print, but that does not necessarily mean the replication will fail.
If several database replication attempts have been made in the logs and the status remains slave is inactive, further investigation is required. In both primary and secondary control servers, run the grab-log-snapshot script and provide resulting files to Support. For instructions, refer to the related article below.
If several database replication attempts have been made in the logs and the status remains slave is inactive, further investigation is required. In both primary and secondary control servers, run the grab-log-snapshot script and provide resulting files to Support. For instructions, refer to the related article below.
Note:
- From ticket 9223583, running FortiNAC in v9.4.3 - the command in step 7 should not include 'bsiadmin bra26153' anymore. It should be run only as:
hsStartSQLReplication <Primary IP address>
- From ticket 10414826, running FortiNAC v9.4.5 and later versions FortiNAC F - Mysql-relay-bin file is called mysql-bin.index
- The same procedure can be applied to the Primary device while keeping the Secondary node down. See Technical Tip: MySQL service failed to start on Primary appliance.
Related articles:
- Technical Tip: Full disk causes database replication to fail
- Technical Tip: How to Use grab-log-snapshot
- Technical Tip: MySQL service failed to start on Primary appliance
- Technical Tip: Rebuild mysql directory on secondary server
- Technical Tip: How to fix DB synchronization on a secondary FortiNAC server with error: 'Access denied for user 'bsiadmin'@'localhost' (using password: YES)'
