Description |
This article describes how to troubleshoot adding a worker when the operation fails with one of the following errors:
Operation failed. to add worker caused by Socket Timeout: Connection refused (Connection refused).
Or
Operation failed. error = Failed to add worker with error: Unknown code 1.
Or
Operation failed. Error Socket channel has reached end of stream. |
Scope | Supervisor cluster (FortiSIEM versions 6.4.x - 6.7.x) with NFS. |
Solution |
1) Check connectivity on port 443 from the worker to the supervisor and vice versa with the commands below:
- From a worker:
nc -vz superIP 443
- From supervisor:
nc -vz workerIP 443
The correct output will look like the following:
[root@FortiSIEM-Worker ~]# nc -vz 192.168.1.234 443
Ncat: Version 7.70 ( https://nmap.org/ncat )
Ncat: Connected to 192.168.1.234:443.
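Where nc is not available, the same TCP reachability test can be approximated with a few lines of Python. This is a minimal sketch, not a FortiSIEM tool: the function name port_is_open and the 5-second timeout are illustrative assumptions, and the script only confirms that a TCP connection can be opened, just as nc -vz does.

```python
import socket


def port_is_open(host, port=443, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper: mirrors what `nc -vz host port` reports, nothing more.
    """
    try:
        # create_connection resolves the host and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and name resolution failures.
        return False
```

Calling port_is_open("superIP") on the worker and port_is_open("workerIP") on the supervisor, with the real addresses substituted, covers both directions. A False result corresponds to the "Connection refused" or timeout case and points to a firewall or routing problem rather than NFS.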
2) If everything is alright with connectivity, check the NFS settings.
- The NFS server exports list in /etc/exports should follow this format, with one entry per line:
/FortiSIEM SupervisorIPAddress(rw,sync,no_root_squash)
/FortiSIEM WorkerIPAddress(rw,sync,no_root_squash)
- It is advisable to verify that the NFS Server has been configured in accordance with this documentation:
Note: NFSv4.2 is a newer version of the NFS protocol and it provides improved performance for large files, I/O-intensive workloads, and other additional features. While not all NFS servers may support NFSv4.2 yet, if the NFS server does support it, it is generally recommended to use it instead of NFSv4.1. To use NFSv4.2, it is necessary to change the mount option in the client’s /etc/fstab file to nfsvers=4.2.
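The /etc/exports format shown above is easy to get subtly wrong, most often by leaving a space between the client address and the opening bracket. The sketch below checks a single line against that format only. It is an illustration under stated assumptions, not a full exports parser: check_exports_line and CLIENT_RE are hypothetical names, and valid entries that differ from the article's format (for example a client with no options) are reported as unexpected.

```python
import re
from typing import List

# Matches "client(options)" with no whitespace anywhere, as in the article's format.
CLIENT_RE = re.compile(r"^[^\s()]+\([^\s()]+\)$")


def check_exports_line(line: str) -> List[str]:
    """Return a list of problems found in one /etc/exports line (empty if none)."""
    entry = line.split("#", 1)[0].strip()  # drop comments and surrounding blanks
    if not entry:
        return []
    if re.search(r"\s\(", entry):
        # "host (rw)" applies the options to every client instead of to "host".
        return ["whitespace before '('"]
    fields = entry.split()
    if len(fields) < 2:
        return ["no client specified"]
    return [
        "unexpected client entry: %r" % client
        for client in fields[1:]
        if not CLIENT_RE.match(client)
    ]
```

Running the function over each line of /etc/exports on the NFS server and printing any non-empty result is enough to spot the misplaced space before re-exporting.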
2.1) If the remote NFS filesystem cannot be mounted on the supervisor and worker, errors may appear in phoenix.log (in the /opt/phoenix/log directory).
Here is an example of this error:
Supervisor:
2023-02-24T14:45:35.196516-05:00 super phMonitorSupervisor[1152741]: [PH_MONITOR_NOTIFICATION_RETURN_FAILURE]:[eventSeverity]=PHL_ERROR,[procName]=phMonitorSupervisor,[fileName]=phMonitorProcess.cpp,[lineNumber]=11975,[destIpAddr]=10.40.40.100,[xmlBody]=<TEST_STORAGE type="nfs"><server_ip>10.40.40.3</server_ip><mount_point>/fortisiem</mount_point></TEST_STORAGE>,[errorNo]=18,[phLogDetail]=phNotification returns failure
Worker:
2023-02-24T14:45:35.196516-05:00 worker phMonitorWorker[2457]: [PH_UTIL_CMD_FAILURE]:[eventSeverity]=PHL_ERROR,[procName]=phMonitorWorker,[fileName]=phMonitorProcess.cpp,[lineNumber]=10679,[command]=python /opt/phoenix/deployment/jumpbox/datastore.py nfs test 10.40.40.3 /fortisiem online,[errorNoInt]=18,[phLogDetail]=Failed to run command
The following commands can be used to identify if the error is present:
- From super:
grep PH_MONITOR_NOTIFICATION_RETURN_FAILURE /opt/phoenix/log/phoenix.log
- From a worker:
grep PH_UTIL_CMD_FAILURE /opt/phoenix/log/phoenix.log
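The grep commands above return the whole log line, which is long and hard to scan. The following sketch extracts the [key]=value pairs from a matching line so that fields such as errorNo, errorNoInt, and command can be read directly. The field layout is inferred only from the two sample lines in this article, and parse_failure, EVENTS, and FIELD_RE are hypothetical names, so treat it as a reading aid rather than a parser for every phoenix.log event.

```python
import re

# The two event names shown in the sample log lines above.
EVENTS = ("PH_MONITOR_NOTIFICATION_RETURN_FAILURE", "PH_UTIL_CMD_FAILURE")

# A field is "[name]=value"; the value runs until the next ",[name]=" or end of line.
FIELD_RE = re.compile(r"\[(\w+)\]=(.*?)(?=,\[\w+\]=|$)")


def parse_failure(line):
    """Return a dict of fields for a matching log line, or None if no event matches."""
    for event in EVENTS:
        if "[%s]" % event in line:
            fields = dict(FIELD_RE.findall(line))
            fields["event"] = event
            return fields
    return None
```

Applied to each line of /opt/phoenix/log/phoenix.log, the function returns None for unrelated lines and a dictionary for the two failure events. In the worker sample above, the command field shows the exact datastore.py call that failed.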
To troubleshoot the issue, follow these steps:
1) First, confirm that no NFS entry has been manually added to /etc/fstab.
2) Next, check the NFS configuration on the super node by running cat /etc/fstab. Mount the NFS share on the worker using the same method as specified in the super’s /etc/fstab file, for example: mount -v -t nfs 10.40.40.3:/fortisiem /data.
3) If errors are encountered during the mount, check that the NFS server exports list (/etc/exports) has the correct syntax. Make sure there are no spaces before the brackets.
4) If there are no errors during the mounting process, proceed to the next step.
5) Confirm that the NFS share was mounted correctly by running mount -t nfs4 or mount -t nfs.
6) Ensure that the /data folder is accessible and has correct permissions by running ls -l /data.
Here is an example:
[root@FSIEM-Super~]# ls -l /data
total 44
drwxrwxr-x 3 postgres postgres 4096 Mar 31 2022 archive
drwxr-xr-x 2 root root 4096 Mar 31 2022 backup
drwxr-xr-x 5 admin admin 4096 May 13 2022 cache
drwxr-xr-x 2 postgres postgres 4096 Mar 31 2022 cmdb
drwxr-xr-x 2 admin admin 4096 Jul 13 2022 custParser
drwxrwxr-x 2 admin admin 4096 Dec 28 10:47 eventDataSum
drwxr-xr-x 8 admin admin 4096 May 3 2022 eventdb
drwxr-xr-x 2 admin admin 4096 Jun 17 2022 jmxXml
drwxr-xr-x 2 admin admin 4096 Jun 17 2022 mibXml
drwx------ 3 admin admin 4096 Mar 31 2022 precomputedb
drwxr-xr-x 2 admin admin 4096 Mar 31 2022 reportdb
7) Test whether a file can be written to the /data directory with the command below:
touch /data/test.txt
If the command fails, a permission issue is the most likely cause.
8) If there is a file named /opt/phoenix/.nfs.json, remove it.
9) Additionally, if there is a license file, remove it as well: /etc/opsd/.fortisiem4x0.
10) Reboot the worker node and double-check that no drives are mounted by running df -h.
11) Finally, go to the GUI and attempt to add the worker node again. |
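Steps 5 to 7 above (confirm the NFS mount, then confirm that /data is writable) can be sketched as a small script for repeated checks. This is an illustrative sketch with hypothetical helper names, nfs_mounts and is_writable. It assumes a Linux node where /proc/mounts lists the device, mount point, and filesystem type in the first three columns, and it uses a temporary file instead of leaving test.txt behind.

```python
import tempfile


def nfs_mounts(mounts_text):
    """Return {mount_point: fs_type} for NFS entries in /proc/mounts-style text."""
    result = {}
    for line in mounts_text.splitlines():
        parts = line.split()
        # Columns: device, mount point, filesystem type, options, dump, pass.
        if len(parts) >= 3 and parts[2] in ("nfs", "nfs4"):
            result[parts[1]] = parts[2]
    return result


def is_writable(directory):
    """Create and remove a temporary file, similar to the touch test in step 7."""
    try:
        with tempfile.NamedTemporaryFile(dir=directory):
            return True
    except OSError:
        # Permission denied, read-only mount, or missing directory.
        return False
```

On a node, passing the contents of /proc/mounts to nfs_mounts and checking for "/data" in the result corresponds to step 5, and is_writable("/data") corresponds to step 7. A mounted but non-writable /data usually points back to the export options or directory ownership on the NFS server.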