Troubleshooting Tip: etcd NOSPACE error
Description
This article describes a method to fix the etcd NOSPACE error.
Scope
FortiSIEM v7.3 and above.
Solution
In FortiSIEM High Availability (HA) environments running v7.3 and above, etcd is used for automated failover. In some cases, etcd may stop working, which also stops replication. A license expired warning may then appear on the follower node. In that case, check the health of etcd with the command below:
etcdctl endpoint status --cluster -w table
Sample output:
+------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+-----------------------+
|        ENDPOINT        |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX |        ERRORS         |
+------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+-----------------------+
| http://10.5.8.x:2379   | b0e9ae9f045235bd |  3.5.13 |  2.1 GB |      true |      false |         2 |     441985 |             441985 | memberID:xxxx NOSPACE |
| http://10.5.8.x:2379   | cff64fd3c11ba5af |  3.5.13 |  2.1 GB |     false |      false |         2 |     441985 |             441985 | memberID:xxxx NOSPACE |
+------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+-----------------------+
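The check above can be scripted so that the NOSPACE condition is detected without reading the table by eye. The following is a minimal sketch, not an official FortiSIEM tool: the function names has_nospace and check_etcd_space and the CTL_BIN override variable are hypothetical, introduced only for illustration. It assumes etcdctl is on the PATH and already configured to reach the cluster.

```shell
#!/bin/sh
# Returns 0 if the text on stdin mentions a NOSPACE alarm, 1 otherwise.
has_nospace() {
    grep -q 'NOSPACE'
}

# Runs the status command from this article and reports whether any
# member shows NOSPACE in its output.
# CTL_BIN is a hypothetical override for the etcdctl binary (assumption).
check_etcd_space() {
    status=$("${CTL_BIN:-etcdctl}" endpoint status --cluster -w table 2>&1) || true
    printf '%s\n' "$status"
    if printf '%s\n' "$status" | has_nospace; then
        echo "NOSPACE reported: the etcd database needs to be cleaned up"
        return 1
    fi
    echo "no NOSPACE reported"
}
```

Calling check_etcd_space prints the status table followed by a one-line verdict, and returns a non-zero exit code when NOSPACE is present, so it can be used in a cron job or a larger health-check script.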
If the ERRORS column shows 'NOSPACE', the etcd database has exceeded the default quota-backend-bytes limit of 2 GB. To resolve it, defragment the database to reclaim space, then clear the alarm to restore cluster health:
etcdctl defrag --command-timeout=10m --cluster
etcdctl alarm disarm
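The two recovery commands above are order-sensitive: the alarm should only be cleared once defragmentation has succeeded. A minimal sketch of a wrapper that enforces this ordering is shown below. The function name recover_etcd and the CTL_BIN variable are hypothetical; only the etcdctl commands themselves come from this article.

```shell
#!/bin/sh
# Runs the recovery sequence from this article and stops at the first failure.
# CTL_BIN is a hypothetical override for the etcdctl binary (assumption);
# setting CTL_BIN=echo prints the commands instead of running them.
recover_etcd() {
    ctl="${CTL_BIN:-etcdctl}"
    # Reclaim free space on every cluster member (10-minute timeout).
    "$ctl" defrag --command-timeout=10m --cluster || return 1
    # Clear the NOSPACE alarm only after defragmentation succeeded.
    "$ctl" alarm disarm || return 1
    # Re-check the DB SIZE and ERRORS columns.
    "$ctl" endpoint status --cluster -w table
}
```

Running CTL_BIN=echo recover_etcd first is a cheap way to review the exact commands before executing them against a production cluster.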
After that, the error should be cleared and the database size should be reduced. Verify with the same status command:
etcdctl endpoint status --cluster -w table
+------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|        ENDPOINT        |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| http://10.5.8.167:2379 | b0e9ae9f045235bd |  3.5.13 |   49 MB |      true |      false |         2 |     441985 |             441985 |        |
| http://10.5.8.166:2379 | cff64fd3c11ba5af |  3.5.13 |   49 MB |     false |      false |         2 |     441985 |             441985 |        |
+------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
If the above steps do not reduce the etcd database size, follow this article: Technical Tip: How to compact ETCD revision to gain space.
Monitor for a few minutes; all services should return to normal. If issues remain, contact support for further investigation of etcd.
