kallbrandt
Contributor II

VIP/IP-Pools stop working - ARP issue? 800C HA, A-A, 5.2.13

Hello,

An odd error - a lot of services suddenly went offline yesterday evening at a client's datacenter. Almost nothing involving NAT worked. Most of the VIPs were dead - the logs are empty, no traffic! (Lots of users, webpages etc.; incoming traffic 24/7.) Failing over to the other firewall makes it work for a while. Same with reboots. Editing a VIP, e.g. changing the public IP and saving, might make it work for a while. The same with IP pools - changing the pool in any way makes it work, for a while. The only outgoing NAT that works reliably is the interface address. All virtual addresses are totally unreliable. No strange traffic or load of any kind.

 

The ISP has no routing problems, the prefixes are advertised, and we did a failover to the backup router (VRRP/BGP) located in another DC - same problem. The other vdoms have internet access and SNAT/DNAT as well, and they work. Other equipment (VPN concentrator etc.) works flawlessly, so I think the ISP side of things is OK. The switches are OK.

 

execute clear system arp table

 

That did actually work a few times.
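
If anyone wants to compare, this is the quickest way to see what the box has cached before and after clearing (vdom name is a placeholder):

config vdom
edit <vdom-name>
get system arp
diagnose ip arp list
end

The diag version shows a bit more detail than get system arp.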

 

Any ideas, gentlemen? A bit lost with this one...

 

(Will open a high prio case with TAC)

Richie

NSE7

14 REPLIES
kallbrandt

UPDATE:

Found the fault...

 

There are several vdoms. The latest one has internet access too, just like the rest. But in this vdom, the VLAN interface ARPs on EVERYTHING. You can ping just about every unused address in the public /24, and it will answer! Listing the ARP table shows nothing. Doing an arping from a Linux machine on the public subnet shows the same thing - it answers to almost everything!
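
To see it from the FortiGate side as well, the built-in sniffer catches it - something like this in the affected vdom (interface name is a placeholder):

diagnose sniffer packet <vlan-interface> 'arp' 4

Every ARP reply the box sends for addresses it shouldn't own shows up there.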

I tried to delete the interface, but then the config sync failed... Had to do a factory reset, then delete a bunch of policies in the vdom on the current master, then paste them back in to get sync going again.

 

But I had to create the interface again due to short maintenance windows, without being able to reboot the master. Back to square one, really - the interface behaves in the same way.

 

The vdom is in heavy use, so I will probably try to set up another physical interface, untagged, instead and see if that works better.

Richie

NSE7

kallbrandt

UPDATE: Nothing works...

Changed the physical interface, tried tagged/untagged... It seems the vdom is fundamentally broken in some way.

Have an escalated ticket now, but might have to "fix" the issue by deleting the public-facing interface and routing the traffic via another vdom instead.
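
If it comes to that, the idea would be an inter-vdom link, roughly like this from the global context (names are placeholders, sketch only - IPs and routes omitted):

config system vdom-link
edit vlink
end
config system interface
edit vlink0
set vdom <public-vdom>
next
edit vlink1
set vdom <broken-vdom>
next
end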

Richie

NSE7

kallbrandt

Check this out:

 

host@nohost:~$ sudo arping 194.xxx.xxx.xxx
ARPING 194.xxx.xxx.xxx
60 bytes from 00:09:0f:09:64:17 (194.xxx.xxx.xxx): index=0 time=5.723 msec
60 bytes from 00:09:0f:09:64:12 (194.xxx.xxx.xxx): index=1 time=5.810 msec
60 bytes from 00:09:0f:09:64:17 (194.xxx.xxx.xxx): index=2 time=12.204 msec
60 bytes from 00:09:0f:09:64:12 (194.xxx.xxx.xxx): index=3 time=12.299 msec

 

:64:12 is the interface with the actual IP address set; :64:17 is the baaad interface. ARP-poisoned by your own FortiGate, basically.
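
For anyone who wants to check the same thing on their own box: proxy ARP for VIPs and pools is controlled by the per-object arp-reply flag, so you can dump it like this (vdom name is a placeholder):

config vdom
edit <vdom-name>
show full-configuration firewall vip
show full-configuration firewall ippool
end

And set arp-reply disable under a VIP turns its proxy ARP off if you need to rule one out.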

Richie

NSE7

Antonio_Milanese

Hi Richie,

 

Odd, to say the least...

 

Are all vdoms in NAT mode? Maybe an overlooked/unused ippool is overlapping? That's the most ordinary hypothesis that comes to my mind...
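
e.g. something quick like this per vdom, just to eyeball the pool and VIP ranges against each other (vdom name is a placeholder):

config vdom
edit <vdom-name>
show firewall ippool
show firewall vip
end

if two vdoms claim overlapping public ranges you would get exactly this kind of duplicate ARP reply...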

 

By the way, if it's confirmed as a bug, can you help us understand how it's eventually triggered, and what your vdom topology is? Are you using hybrid vdoms, half-numbered, etc.?

 

Thanks,

 

Antonio

 

romanr

Hey,

 

According to those MAC addresses, both ARP replies come from the root vdom of your cluster.

 

I'd dump a "diagnose debug report" into a text file and look for any runtime references to those IP and MAC addresses... Maybe that will give you a hint.
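
Something along these lines should do it (hostname is a placeholder - or just let your terminal client log the session):

ssh admin@<fortigate> 'diagnose debug report' > report.txt
grep -i '00:09:0f:09:64:17' report.txt

The report is long, so grepping for the MACs and the public IPs is the fastest way through it.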

 

Br,

Roman
