FortiSIEM
FortiSIEM provides Security Information and Event Management (SIEM) and User and Entity Behavior Analytics (UEBA)
RuiChang
Staff
Article Id 418797
Description

 

This article describes a method to fix the etcd NOSPACE error.

 

Scope

 

FortiSIEM v7.3 and above.

 

Solution

 

In a FortiSIEM High Availability (HA) environment running v7.3 or later, etcd is used for automated failover. In some cases, etcd may stop working, which also stops replication, and the follower node may display a license-expired warning. If this happens, check the health of etcd with the following command:

 

etcdctl endpoint status --cluster -w table

 

Sample output:

 

+------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+-----------------------+
|        ENDPOINT        |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX |        ERRORS         |
+------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+-----------------------+
| http://10.5.8.x:2379   | b0e9ae9f045235bd |  3.5.13 |  2.1 GB |      true |      false |         2 |     441985 |             441985 | memberID:xxxx NOSPACE |
| http://10.5.8.x:2379   | cff64fd3c11ba5af |  3.5.13 |  2.1 GB |     false |      false |         2 |     441985 |             441985 | memberID:xxxx NOSPACE |
+------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+-----------------------+
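
As a rough sketch, the check for the NOSPACE error can be scripted rather than read by eye. The helper name has_nospace is hypothetical and is not part of FortiSIEM or etcd; it only searches the status table text for the string shown in the ERRORS column above.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with FortiSIEM or etcd):
# reads `etcdctl endpoint status` table output on stdin and
# succeeds (exit code 0) if any row reports NOSPACE.
has_nospace() {
    grep -q 'NOSPACE'
}

# Example usage on a node where etcdctl is already configured:
#   etcdctl endpoint status --cluster -w table | has_nospace \
#       && echo "etcd storage quota exceeded"
```

The active alarms can also be listed directly with etcdctl alarm list.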

 

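If the ERRORS column shows 'NOSPACE', the etcd backend database has exceeded its storage quota (quota-backend-bytes, 2 GB by default), and etcd rejects further writes until space is reclaimed and the alarm is cleared. To resolve it, defragment the database to free space, then disarm the alarm to restore cluster health: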

 

etcdctl defrag --command-timeout=10m --cluster
etcdctl alarm disarm
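
The two commands above could be wrapped so that the alarm is only disarmed when the defragmentation succeeds. This is a minimal sketch: the function name etcd_reclaim_space and the ETCDCTL override variable are hypothetical and introduced here for illustration only.

```shell
#!/bin/sh
# ETCDCTL is an assumed override for the binary to run; defaults to `etcdctl`.
ETCDCTL="${ETCDCTL:-etcdctl}"

# Hypothetical wrapper: defragment all cluster members, then clear the alarm.
# `&&` ensures the alarm is disarmed only if the defragmentation succeeded.
etcd_reclaim_space() {
    $ETCDCTL defrag --command-timeout=10m --cluster &&
        $ETCDCTL alarm disarm
}

# Example usage:
#   etcd_reclaim_space || echo "defrag or disarm failed, check etcd logs"
```

Chaining with && avoids clearing the alarm while the database is still over quota, in which case the alarm would be expected to trigger again.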

 

After that, the error should be cleared and the database size should drop. Run the status command again to confirm:

 

etcdctl endpoint status --cluster -w table

 

+------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|        ENDPOINT        |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| http://10.5.8.167:2379 | b0e9ae9f045235bd |  3.5.13 |   49 MB |      true |      false |         2 |     441985 |             441985 |        |
| http://10.5.8.166:2379 | cff64fd3c11ba5af |  3.5.13 |   49 MB |     false |      false |         2 |     441985 |             441985 |        |
+------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
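
To put the before and after sizes in context, the database size can be expressed as a percentage of the quota. The sketch below assumes the default quota of 2 GiB (2 * 1024^3 bytes) has not been overridden, and uses byte values that approximate the 2.1 GB and 49 MB figures in the sample output. The helper name quota_pct is hypothetical.

```shell
#!/bin/sh
# Assumed default etcd backend quota: 2 GiB expressed in bytes.
DEFAULT_QUOTA_BYTES=$((2 * 1024 * 1024 * 1024))

# Hypothetical helper: prints the DB size as a whole-number
# percentage of the quota (integer division, rounded down).
# $1 = database size in bytes, $2 = quota in bytes (optional).
quota_pct() {
    echo $(( $1 * 100 / ${2:-$DEFAULT_QUOTA_BYTES} ))
}

quota_pct 2100000000   # approx. 2.1 GB before the defrag, prints 97
quota_pct 49000000     # approx. 49 MB after the defrag, prints 2
```
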

 

Monitor the cluster for a few minutes; all services should return to normal.
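
The monitoring step could be automated with a small polling loop. This is only a sketch: retry_until_ok is a hypothetical helper, and the attempt count and interval in the usage example are arbitrary assumptions rather than values recommended by Fortinet.

```shell
#!/bin/sh
# Hypothetical helper: retry a command until it succeeds.
# $1 = maximum attempts, $2 = seconds between attempts, rest = command to run.
retry_until_ok() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        "$@" && return 0
        # Sleep only if another attempt will follow.
        [ "$i" -lt "$attempts" ] && sleep "$delay"
        i=$((i + 1))
    done
    return 1
}

# Example usage: poll cluster health every 30 seconds, up to 10 times.
#   retry_until_ok 10 30 etcdctl endpoint health --cluster
```
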