Bob,
I made it two weeks and finally had a problem reported.
It's quite an odd situation, in that the host uses a round-robin strategy for the DNS record (which happens to be a CNAME), and the TTL of the CNAME record is about 5 minutes while the TTL of the A record is 20 seconds.
95% of the time, the client receives the complete resolution of the CNAME record to the A record's IP without an issue, either from its local cache (ipconfig /displaydns) or from the local DNS server's cache or resolution procedure.
If I trace the transaction ID of a failed lookup, I can see the query packets being sent from the client (pcap), then arriving and being processed at the DNS server (DNS logs). The CNAME record is resolved to the A record, but the A record response is never received by the server. The server then goes out to the SOA servers for the domain to resolve the target of the CNAME (the A record).
The strategy of using such a low A record TTL with a higher CNAME record TTL really seems like a bad idea to me, particularly since the resource is an external host (or at least the introduction of a CNAME and two differing TTLs seems like one).
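To put rough numbers on the TTL mismatch, here is a minimal sketch. It assumes the CNAME TTL is exactly 300 seconds ("about 5 minutes"), that the name is queried continuously, and that the cache refetches a record the moment it expires; the function name is made up for illustration.

```python
# TTL values from the description above, in seconds.
# CNAME_TTL = 300 is an assumption standing in for "about 5 minutes".
CNAME_TTL = 300
A_TTL = 20


def upstream_lookups(window_seconds, ttl):
    """Count how many times a cache must fetch a record with the given TTL
    during a window, assuming constant demand and an immediate refetch
    whenever the cached copy expires."""
    # Ceiling division: a partial TTL period still needs one fetch.
    return -(-window_seconds // ttl)


a_fetches = upstream_lookups(CNAME_TTL, A_TTL)
cname_fetches = upstream_lookups(CNAME_TTL, CNAME_TTL)

print(f"A record fetches per CNAME lifetime: {a_fetches}")
print(f"CNAME fetches per CNAME lifetime: {cname_fetches}")
```

Under those assumptions, the A record has to be looked up many more times than the CNAME over the same period, so every one of those extra lookups is another chance for a response to go missing.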
I'm having the person test by using the A record that's the target of the CNAME to see if the problem continues to occur. (FYI: from a political/appearance standpoint, the A record name is within the domain of a swallowed-up company, so they just "threw a CNAME at it.")
Thanks,
Matt
[UPDATE/CONCLUSION/SOLUTION]
Reviewing the Windows DNS Server logs revealed the problem.
Times the Windows DNS server asked for the CNAME's A record:
20130227 12:59:02 (asking FortiGate's Virtual Server Address)
20130227 12:59:08 (asking geographically_dispersed_SOA_server_1)
20130227 12:59:12 (asking geographically_dispersed_SOA_server_2)
20130227 12:59:16 (asking geographically_dispersed_SOA_server_3)
20130227 12:59:16 (asking geographically_dispersed_SOA_server_4)
20130227 12:59:23 (asking FortiGate's Virtual Server Address)
[note that the elapsed time, 12:59:02 to 12:59:23, is about 20 seconds, like the A record's TTL]
Time responded to:
20130227 12:59:23 (answered by FortiGate's Virtual Server Address)
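The gaps between those attempts are easier to see as offsets from the first query. This is a small sketch that only re-reads the timestamps listed above; the target labels are shortened, and the variable names are mine.

```python
from datetime import datetime

# Timestamps copied from the log excerpt above; labels shortened.
queries = [
    ("20130227 12:59:02", "FortiGate Virtual Server"),
    ("20130227 12:59:08", "SOA server 1"),
    ("20130227 12:59:12", "SOA server 2"),
    ("20130227 12:59:16", "SOA server 3"),
    ("20130227 12:59:16", "SOA server 4"),
    ("20130227 12:59:23", "FortiGate Virtual Server"),
]
answered = "20130227 12:59:23"

FMT = "%Y%m%d %H:%M:%S"


def parse(ts):
    """Parse a 'YYYYMMDD HH:MM:SS' timestamp as used in the excerpt."""
    return datetime.strptime(ts, FMT)


first = parse(queries[0][0])

# Seconds elapsed since the first query, for each attempt.
offsets = [
    (int((parse(ts) - first).total_seconds()), target)
    for ts, target in queries
]
for off, target in offsets:
    print(f"+{off:2d}s  asked {target}")

total = int((parse(answered) - first).total_seconds())
print(f"answered after {total}s")
```

At the log's one-second resolution the whole sequence spans 21 seconds, which is in line with the roughly 20 seconds noted above.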
Using: nslookup -q=A -debug [FQDN of CNAME] [FortiGate's Virtual Server address]
I'm able to see the SOA servers' FQDNs and confirm that they correlate with the list of geographically dispersed SOA servers above.
So, although the FortiGate's Virtual Server failed to (at least) return a response (from the public DNS server in its Real Server pool, maybe), the Windows DNS Server tried its secondary method of resolving the name: querying the SOA servers directly.
Problem: Firewall policy allowed the Windows DNS server to reach port 53 only on the Virtual Server address, so its fallback queries to the SOA servers could not get through.
Solution: Allow the Windows DNS server to access any IP address over port 53 by firewall policy. Unfortunately this doesn't answer why the FortiGate failed to deliver a lookup response in the first place, but it is a good workable solution.
You could implement two policies, placing the one whose destination is the Virtual Server's IP before the one targeting 'all', then review the policy hit 'count' column to see how common this "problem" is (for instance).
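To illustrate why the order of the two policies matters for the hit counts, here is a toy model. It is not FortiOS configuration: the addresses are documentation-range placeholders, the policy names are made up, and it assumes policies are evaluated top-down with the first match winning.

```python
from ipaddress import ip_address, ip_network

# Hypothetical placeholder address for the Virtual Server.
VIRTUAL_SERVER = "203.0.113.10"

# Ordered policy list: the specific policy sits above the catch-all.
policies = [
    {"name": "dns-to-virtual-server",
     "dst": ip_network(f"{VIRTUAL_SERVER}/32"), "port": 53, "hits": 0},
    {"name": "dns-to-all",
     "dst": ip_network("0.0.0.0/0"), "port": 53, "hits": 0},
]


def match(dst_ip, port):
    """Return the name of the first matching policy (top-down) and bump
    its hit counter; return None if nothing matches (implicit deny)."""
    for policy in policies:
        if ip_address(dst_ip) in policy["dst"] and port == policy["port"]:
            policy["hits"] += 1
            return policy["name"]
    return None


# Made-up traffic mix: mostly queries to the Virtual Server, plus one
# fallback query to a (hypothetical) SOA server address.
traffic = [VIRTUAL_SERVER] * 19 + ["198.51.100.7"]
for dst in traffic:
    match(dst, 53)

for policy in policies:
    print(policy["name"], policy["hits"])
```

With the specific policy first, the catch-all's counter only increases when the DNS server had to fall back to another address, so that counter approximates how often the fallback happens. If the order were reversed, the catch-all would absorb every query and the count would tell you nothing.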
Bob: this may solve your issue.
Thanks,
Matt
" …you would also be running into the trap of looking for the answer to a question rather than a solution to a problem." - [link=http://blogs.msdn.com/b/oldnewthing/archive/2013/02/13/10393162.aspx]Raymond Chen[/link]