Technical Tip: FortiGate DNS failover behavior when using 'failover' as server-select-method
| Description | This article describes how 'retry' and 'timeout' values impact the DNS failover when using 'failover' as the server-select-method. |
| Scope | FortiGate. |
| Solution | With the default configuration (timeout 5, retry 2), the FortiGate fails over to the secondary DNS server after the primary server has gone unanswered for 2 timeout intervals of 5 seconds each, that is, 10 seconds after the initial query.
krbfgt (dns) # show full-configuration
config system dns
    set timeout 5
    set retry 2
end
Debugs:
Initial query to login.windows.net:
2025-06-02 13:49:20 [worker 0] dns_local_lookup()-2529: vfid=0, real_vfid=0, qname=login.windows.net, qtype=1, qclass=1, offset=35, map#=3 max_sz=131072
2025-06-02 13:49:20 [worker 0] dns_lookup_aa_zone()-608: vfid=0, fqdn=login.windows.net
2025-06-02 13:49:20 [worker 0] dns_send_request()-1430
2025-06-02 13:49:20 [worker 0] dns_send_resol_request()-1266: orig id: 0x005c local id: 0x005c domain=login.windows.net
2025-06-02 13:49:20 [worker 0] dns_find_best_server()-654: found server: 10.5.20.107 (vfid=0 vrf=0) <--
After 5 seconds:
2025-06-02 13:49:25 [worker 0] dns_retransmit_func()-1703: jiffies=212908 created=212408 wait_cat=0 wait_res=1 profile=last_tx=212408 ftg_last_tx=0 domain=login.windows.net (orig id: 0x005c local id:0x005c active)
2025-06-02 13:49:25 [worker 0] dns_send_request()-1430
2025-06-02 13:49:25 [worker 0] dns_send_resol_request()-1266: orig id: 0x005c local id: 0x005c domain=login.windows.net
2025-06-02 13:49:25 [worker 0] dns_send_resol_request()-1315: retransmission (domain=login.windows.net)
2025-06-02 13:49:25 [worker 0] dns_server_downgrade()-393: ip=10.5.20.107 encrypt=none rating=0 failure=0 last_failed=0
After 10 seconds from the initial query:
2025-06-02 13:49:30 [worker 0] dns_send_resol_request()-1266: orig id: 0x005c local id: 0x005c domain=login.windows.net
2025-06-02 13:49:30 [worker 0] dns_send_resol_request()-1315: retransmission (domain=login.windows.net)
2025-06-02 13:49:30 [worker 0] dns_server_downgrade()-393: ip=10.5.20.107 encrypt=none rating=0 failure=2 last_failed=406
2025-06-02 13:49:30 [worker 0] dns_find_best_server()-654: found server: 96.45.46.46 (vfid=0 vrf=0)
2025-06-02 13:49:30 [worker 0] dns_tcp_forward_request()-1111: vdom=root req_type=1 domain=login.windows.net
2025-06-02 13:49:30 [worker 0] dns_tcps_schedule_query_write()-374: orig id: 0x005c local id: 0x005c domain=login.windows.net mode=0
2025-06-02 13:49:30 [worker 0] dns_tcps_schedule_query_write()-395: schedule query (domain=login.windows.net) to connection 96.45.46.46:53 mode=0 <- Firewall doing the query to secondary
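The FortiGate calculates the failover time as retry * timeout, so the default values give 2 * 5 = 10 seconds, which matches the timestamps in the debug output above.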
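To shorten the failover window, lower the timeout under 'config system dns'. The following is a minimal sketch rather than a recommended setting: it assumes 'failover' is the intended server-select-method (as in the title of this article) and reuses the 'timeout' and 'retry' options shown in the default configuration. With these values the expected failover time is 2 * 1 = 2 seconds. Accepted ranges and defaults may differ between FortiOS versions, so verify them on the unit before applying.

```
config system dns
    set server-select-method failover
    set timeout 1
    set retry 2
end
```

A shorter timeout makes the firewall give up on a slow primary server sooner, so choose a value that still tolerates normal latency to the primary DNS server.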
With the timeout changed to 1 and the retry left at 2, the failover takes 2 seconds:
Initial query to api.akamai.com:
2025-06-04 17:41:47 [worker 0] dns_local_lookup()-2529: vfid=0, real_vfid=0, qname=api.akamai.com, qtype=1, qclass=1, offset=32, map#=3 max_sz=131072
After 1 second:
2025-06-04 17:41:48 [worker 0] dns_retransmit_func()-1703: jiffies=18887136 created=18887036 wait_cat=0 wait_res=1 profile=
After 2 seconds from the initial query:
2025-06-04 17:41:49 [worker 0] dns_retransmit_func()-1703: jiffies=18887243 created=18887141 wait_cat=0 wait_res=1 profile=
2025-06-04 17:41:49 [worker 0] tcp_handle_response()-164: domain=api.akamai.com (id=0x0044)
Related article: Technical Tip: Troubleshoot DNS high latency issue
|
