I have a VRRP-related question that I am trying to get some insight into.
We have two FortiGate clusters, each with two nodes in active/passive HA. Cluster A is the primary cluster and is supposed to do all the work, all the time, unless it fails completely (both nodes of that a/p cluster). Then the second cluster should take over. Our networks/VLANs are attached to both clusters, and we are using VRRP to give the servers and clients in those networks a single gateway IP (e.g. 10.0.0.1/24).
The VRRP configuration uses virtual MACs. The first/primary cluster gets the second IP (e.g. 10.0.0.2, plus 10.0.0.1 because it is master) and the second/backup cluster gets the third IP (e.g. 10.0.0.3).
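As a rough sketch of what that looks like on the primary cluster (the interface name "vlan10", the VRID 1 and the priority 200 are placeholders for illustration, not our real values, and the "#" lines are just annotations to leave out when pasting):

```
config system interface
    # "vlan10" is a placeholder interface name
    edit "vlan10"
        set ip 10.0.0.2 255.255.255.0
        set allowaccess ping
        set vrrp-virtual-mac enable
        config vrrp
            # VRID 1 and priority 200 are assumed values
            edit 1
                set vrip 10.0.0.1
                set priority 200
            next
        end
    next
end
```

The backup cluster would be the same apart from 10.0.0.3 as the interface IP and a lower priority.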
When pinging from the FortiGates themselves, I can reach all three IP addresses (10.0.0.1-3).
However, when coming from an outside network, I can only ping 10.0.0.1 and 10.0.0.2. The third IP (10.0.0.3) gives no answer.
What I saw while sniffing: the requests do reach 10.0.0.3 (backup cluster), but they are answered by 10.0.0.2 (the primary cluster).
What am I missing? Is there a configuration that makes sure it works as intended (having a gateway IP and a master/backup situation), but also makes sure that I can ping the third IP address?
Thank you for your question. Do you have any debug flows from both devices? I would like to see exactly what happens when you ping 10.0.0.3. Under normal circumstances the secondary cluster should reply, because you are pinging its interface IP. So if the primary cluster is replying on its behalf, some configuration (a VIP or IP pool, for example) might be causing this.
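In case it helps, a flow debug for this could look roughly like the following. This is a sketch to run on both clusters while the ping is going. The packet count of 20 is an arbitrary choice, and the "#" lines are annotations to leave out when pasting.

```
# start from a clean state
diagnose debug reset
diagnose debug flow filter clear
# only trace ICMP (protocol 1) to or from the backup cluster's IP
diagnose debug flow filter addr 10.0.0.3
diagnose debug flow filter proto 1
diagnose debug flow show function-name enable
# 20 is an assumed packet count, adjust as needed
diagnose debug flow trace start 20
diagnose debug enable
```

When you have the output, stop the trace with:

```
diagnose debug disable
diagnose debug flow trace stop
diagnose debug reset
```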
I would expect that kind of behavior if your ping is coming from "outside" of 10.0.0.0/24. I assume the ping packets arrive on another interface of the primary cluster, so they have to pass a policy from the incoming interface to the VLAN. I'm not sure (I haven't set up a test environment and sniffed it) whether the primary forwards the packet to the secondary cluster at all, given that it knows 10.0.0.3 is the VRRP backup's local IP. You mentioned you saw the reply coming back from the primary. If that's the case, it is probably because the primary never sent the packet to the secondary.
But if it does forward the packet to the secondary, the secondary checks the return path to the source. If its own default route is active, or it has a more specific route to the source that points out a different interface, the packet is dropped with "reverse path check fail, drop".
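To see which return path the secondary would use, a route lookup for the ping's source address should be enough. A sketch, where 192.0.2.10 is only a stand-in for the real source address (I don't know yours), and the "#" line is an annotation to leave out when pasting:

```
# 192.0.2.10 is a placeholder for the outside host that sends the ping
get router info routing-table details 192.0.2.10
```

If the route shown there does not point out the interface the ping arrived on, the reverse path check is the likely reason for the drop.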
Either way, as Adrian said, I would run flow debugging on both clusters while you send the ping packets. And if you need an IP to manage the secondary cluster from "outside", you should use a different interface/IP to reach it.
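Alongside the flow debug, a sniffer on each cluster shows on which interface the echo request arrives and which unit sends the reply. A minimal sketch, assuming it is fine to capture on all interfaces (the "#" line is an annotation to leave out when pasting):

```
# verbosity 4 prints the interface name, 0 means no packet limit, l adds local timestamps
diagnose sniffer packet any 'host 10.0.0.3 and icmp' 4 0 l
```

Stop it with Ctrl+C once you have seen a few requests.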