BGP failover did not work as expected

I've got an interesting scenario for any folks using BGP on their FortiGates (or possibly any devices).



We have two ISPs, but the connections are highly asymmetrical, so we investigated both BGP Conditional Advertisement and AS-prepends to keep traffic from coming in over the smaller connection unless our main ISP is truly down (we only have one /24, so we can't do any cool load-balancing stuff). In the end we chose AS-prepends because failover should be faster: the backup path is already advertised, so we aren't waiting for a new route to propagate. In theory (and in testing when we implemented it), this worked great. If I shut down the primary ISP, I see failover happen roughly as expected, within 30-60 seconds.
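
For anyone less familiar with prepending, here is a toy Python sketch of the idea. It is not real BGP and not our config: it only models the "shortest AS path wins" tie-breaker, and the AS numbers, path names, and the `best_path` helper are all made up for illustration.

```python
# Toy model only: real BGP best-path selection has many more steps.
# AS numbers come from documentation/private ranges and are hypothetical.

def best_path(paths):
    """Return the name of the path with the shortest AS path, or None if no paths remain."""
    if not paths:
        return None
    return min(paths, key=lambda name: len(paths[name]))

paths = {
    "primary_isp": [64500, 65001],                       # announced as-is
    "backup_isp": [64501, 65001, 65001, 65001, 65001],   # our AS prepended three extra times
}
print(best_path(paths))  # primary_isp

# The primary path is withdrawn (for example, our session to that ISP goes down).
del paths["primary_isp"]
print(best_path(paths))  # backup_isp
```

The point is that failover depends on the shorter path actually being withdrawn. As long as it is still present somewhere upstream, the prepended path never wins.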

That brings me to last week when our primary ISP did maintenance: code upgrades on their routers. We saw a complete outage for the entire time their router was down (about 10 minutes each reboot). No failover. Why? I have a theory, but I'm curious if anyone can validate this with personal experience.

I think it has to do with BGP Graceful Restart being enabled on the peering between my ISP and their upstream providers. I think my route stayed alive upstream during the router reboot, and I'm not sure that any kind of change on my end (conditional advertisement, for example) would help. Even if/when we get symmetrical connections, we would still lose all traffic from any upstream provider that thought it should keep sending to this ISP instead of the other one. Has anyone else encountered this? Is it just a necessary evil of network maintenance?
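
To make the theory concrete, here is a toy Python sketch of what I suspect happened upstream. It is an illustration under my own assumptions, not real router behavior: the `Route` class, the function names, and the AS-path lengths are invented, and the restart timer is not modeled at all.

```python
from dataclasses import dataclass

@dataclass
class Route:
    via: str            # which ISP the route was learned through (hypothetical names)
    as_path_len: int    # AS-path length as seen by the upstream router
    stale: bool = False

def on_peer_down(routes, peer, graceful_restart):
    """Model what an upstream router does with routes from `peer` when that session drops."""
    if graceful_restart:
        # Routes are kept and only flagged as stale until the restart timer expires.
        for r in routes:
            if r.via == peer:
                r.stale = True
        return routes
    # Without graceful restart, the routes are removed right away.
    return [r for r in routes if r.via != peer]

def best(routes):
    # Stale routes still compete normally; shortest AS path wins in this toy model.
    return min(routes, key=lambda r: r.as_path_len).via

def make_routes():
    return [Route("primary_isp", 2), Route("backup_isp", 5)]

print(best(on_peer_down(make_routes(), "primary_isp", graceful_restart=False)))  # backup_isp
print(best(on_peer_down(make_routes(), "primary_isp", graceful_restart=True)))   # primary_isp
```

If that is roughly what the upstream routers did, the shorter path through the rebooting router stayed preferred the whole time, and nothing I advertise on the backup link would have changed that.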

I have the same scenario set up in our environment, but our provider is telling us they can't see the prepend being sent on their end (both routes look the same to them).


So unfortunately, I can't validate what you're seeing, but I was wondering if you'd be willing to share what your AS-prepend configuration looks like? (If I can get our prepend working, maybe I could help confirm your theory.)


