This scenario applies to active-passive HA deployments that use SDN connector failover. See this document for more information on this deployment.
The interface flaps can be identified using the following logs:
- In the HA history, the HA ports flap continuously:
<2024-09-27 05:02:40> port port2 link status changed: 0->1
<2024-09-27 05:02:40> port port1 link status changed: 0->1
<2024-09-27 05:02:39> port port2 link status changed: 1->0
<2024-09-27 05:02:38> port port1 link status changed: 1->0
<2024-09-27 05:01:40> port port2 link status changed: 0->1
<2024-09-27 05:01:40> port port1 link status changed: 0->1
- In the HA event log, entries similar to the following appear:
date=2024-09-27 time=04:52:40 eventtime=1727430760307635869 tz="-0500" logid="0108035013" type="event" subtype="ha" level="error" vd="root" logdesc="HA failover failed" msg="azd failed to add public ip in nic azprf-fortigate-fw-FGT-A-Nic1"
date=2024-09-27 time=04:51:40 eventtime=1727430700226158351 tz="-0500" logid="0108035013" type="event" subtype="ha" level="error" vd="root" logdesc="HA failover failed" msg="azd failed to add public ip in nic azprf-fortigate-fw-FGT-A-Nic1"
date=2024-09-27 time=04:50:41 eventtime=1727430640819825851 tz="-0500" logid="0108035013" type="event" subtype="ha" level="error" vd="root" logdesc="HA failover failed" msg="azd failed to add public ip in nic azprf-fortigate-fw-FGT-A-Nic1"
date=2024-09-27 time=04:49:40 eventtime=1727430580400252451 tz="-0500" logid="0108035013" type="event" subtype="ha" level="error" vd="root" logdesc="HA failover failed" msg="azd failed to add public ip in nic azprf-fortigate-fw-FGT-A-Nic1"
- In the crash log, the azd process repeatedly brings the interfaces down and back up:
16380: 2024-09-27 04:45:40 Interface port2 is brought up. process_id=2424, process_name="azd"
16381: 2024-09-27 04:46:38 Interface port1 is brought down. process_id=2424, process_name="azd"
16382: 2024-09-27 04:46:38 Interface port2 is brought down. process_id=2424, process_name="azd"
16383: 2024-09-27 04:46:40 Interface port1 is brought up. process_id=2424, process_name="azd"
16384: 2024-09-27 04:46:40 Interface port2 is brought up. process_id=2424, process_name="azd"
- If the above logs match, enable debugging for the SDN connector with the following commands:
diagnose debug application azd -1
diagnose debug enable
Check whether API failures similar to the following appear in the debug output:
azd api failed, url = https://management.azure.com/subscriptions/ec162d24-afb6-4bb7-9c1d-6b73b5e13791/resourceGroups/hrcaz... , rc = 404 {"error":{"code":"ResourceNotFound","message":"The Resource 'Microsoft.Network/publicIPAddresses/AZPR-nuance-lb' under resource group 'hrcaz-pr' was not found. For more details please go to https://aka.ms/ARMResourceNotFoundFix"}}
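In this log, hrcaz-pr is the resource group in which the FortiGate queries the Azure API for the IP address information of AZPR-nuance-lb. The 404 ResourceNotFound response means that no public IP address with that name exists in that resource group.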
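When many such debug lines have been captured, the resource name and resource group can be pulled out of the error text instead of being read by eye. The following is a minimal Python sketch, assuming the message wording matches the sample above; the function name parse_not_found and the regular expression are illustrative and are not part of FortiOS or of any Azure tooling.

```python
import re

# Sample line copied from the azd debug output above
# (the URL is truncated in the source and is kept that way here).
SAMPLE = (
    'azd api failed, url = https://management.azure.com/subscriptions/'
    'ec162d24-afb6-4bb7-9c1d-6b73b5e13791/resourceGroups/hrcaz... , rc = 404 '
    '{"error":{"code":"ResourceNotFound","message":"The Resource '
    "'Microsoft.Network/publicIPAddresses/AZPR-nuance-lb' under resource group "
    "'hrcaz-pr' was not found. For more details please go to "
    'https://aka.ms/ARMResourceNotFoundFix"}}'
)

# Assumption: the error message follows the wording seen in the sample.
NOT_FOUND = re.compile(
    r"The Resource '(?P<rtype>[^']+)/(?P<name>[^'/]+)' "
    r"under resource group '(?P<group>[^']+)' was not found"
)


def parse_not_found(line):
    """Return (resource_type, resource_name, resource_group), or None if no match."""
    match = NOT_FOUND.search(line)
    if match is None:
        return None
    return match.group("rtype"), match.group("name"), match.group("group")


if __name__ == "__main__":
    # Prints the resource type, the public IP name, and the resource group queried.
    print(parse_not_found(SAMPLE))
```

The extracted resource group is the one the FortiGate queried, so it can be compared directly with the resource group that actually holds the public IP address on Azure.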
To resolve this issue, it is necessary to map the correct resource group by making changes either on the FortiGate or on Azure.
On the FortiGate, use the following CLI commands to change the resource group of the public IP address for which the errors appear:
config system sdn-connector
    edit "AzureHA"
        config nic
            edit "FGT-FGT-A-Nic1"
                config ip
                    edit "ipconfig1"
                        set public-ip "FGTPublicIP"
                        set resource-group ''
                    next
                    edit "Test_IP"
                        set public-ip "AZPR-nuance-lb"
                        set resource-group '<resource_group_name as on Azure>'
                    next
                end
            next
        end
    next
end