Support Forum
The Forums are a place to find answers on a range of Fortinet products from peers and product experts.
Dipen
New Contributor III

OSPF Route Learning in HA

We have two FG3600C appliances in an HA cluster, running firmware 5.0.5. We run OSPF, and the immediate neighbor is an HP 12508 switch. The master node learns all routes from the switch. When we test HA by bringing down one of the cluster's monitored ports, the failover is immediate. However, the slave device takes a long time to rebuild the OSPF routing table: approximately 15 ping timeouts. Is this the default behavior? We want to reduce this. Why does the slave appliance take so long to learn the OSPF routes?

Ahead of the Threat. FCNSA v5 / FCNSP v5

Fortigate 1000C / 1000D / 1500D

 


9 Replies
Dipen
New Contributor III

I am attaching screenshots of the routing tables as seen in the GUI of the master and slave units. As we can see, the OSPF routes are also visible on the slave unit, as "HA" routes. But when an actual failover happens, these "HA" routes disappear and are re-learned. Is there a way to prevent re-learning during HA failover?

Antonio_Milanese

Hello Dipen,

You should enable OSPF graceful restart on the FGT and on your core, so that the restarting device can quickly resume full operation without recalculating its routes:

config router ospf
    set restart-mode graceful-restart
end

To avoid forwarding disruption during failover due to FIB invalidation, you should also increase the HA route TTL to accommodate the switchover duration and the OSPF graceful restart:

config system ha
    set route-ttl 60
    set route-wait 60
    set route-hold 60
end

Hope this helps.

Regards,
Antonio
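The reasoning behind raising route-ttl can be put as rough arithmetic. The sketch below is only a simplified model, not FortiOS behavior taken from documentation: it assumes the new primary keeps the HA-synced routes for route-ttl seconds after the failover, and that OSPF installs its own routes some time later. The function name and the 45-second reconvergence figure are hypothetical and chosen only for illustration.

```python
def forwarding_gap(route_ttl_s: float, reconverge_s: float) -> float:
    """Seconds without a usable route after failover, under the simplified model.

    Assumption: HA-synced routes stay in the new primary's FIB for
    route_ttl_s seconds, and OSPF routes are installed reconverge_s
    seconds after the failover.
    """
    if route_ttl_s < 0 or reconverge_s < 0:
        raise ValueError("times must be non-negative")
    # A gap exists only if the synced routes expire before OSPF is back.
    return max(0.0, reconverge_s - route_ttl_s)


# Hypothetical numbers, chosen only to illustrate the arithmetic.
for ttl in (10, 60):
    gap = forwarding_gap(route_ttl_s=ttl, reconverge_s=45)
    print(f"route-ttl={ttl}s, reconverge=45s -> gap={gap:.0f}s")
```

In this model the gap disappears once route-ttl is at least as long as the time OSPF needs to come back, which is the point of setting it to cover the switchover plus the graceful restart.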
Dipen

Does FortiGate use IETF graceful restart or non-IETF graceful restart? Do we also have to tweak restart-period?

Dipen
New Contributor III

I have made the changes you suggested (on the FortiGate only, not on the core switch). Now when a failover happens (disconnecting one monitored port) there is only one 'timeout' and the routing table is not rebuilt. This is what was required. Thanks for the suggestion. However, when I revert (connect the port back), there are still five 'timeouts'. Is there a way we can prevent this as well?

Antonio_Milanese

Hello Dipen,

"However, when I revert (connect the port back), there are still five 'timeouts'. Is there a way we can prevent this as well?"

Are you triggering a failback in an HA cluster with asymmetric priorities, or is it an A-P HA?

"Does FortiGate use IETF graceful restart or non-IETF graceful restart?"

AFAIK both:

restart-mode graceful-restart == IETF RFC 3623
restart-mode lls == non-IETF NSF (say, Cisco mode)

Anyhow, you should enable OSPF GR on the adjacent routers so that they act as GR helpers during the restart. Otherwise the GR process could exit abruptly (for example upon LSA arrival), and your FIB will be invalidated both by the TTL and by the failed restart. AFAIK on H3C switches neither GR nor GR helper is enabled by default. As for the GR grace timer, IMHO the default of 120 sec (per the RFC, IIRC) is enough for most topologies.

Regards,
Antonio
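The conditions in the reply above can be written as a small decision sketch. This is only one simplified reading of the post, not FortiOS or RFC logic: the function and parameter names are hypothetical, and the 120-second default is the grace period quoted in the thread.

```python
def graceful_restart_outcome(restart_mode: str,
                             neighbor_is_helper: bool,
                             restart_takes_s: float,
                             grace_period_s: float = 120.0) -> str:
    """Classify a failover under a simplified reading of the reply above."""
    # The thread names two restart modes: graceful-restart (IETF) and lls.
    if restart_mode not in ("graceful-restart", "lls"):
        return "no GR: adjacency is rebuilt from scratch"
    # The adjacent router has to cooperate as a GR helper.
    if not neighbor_is_helper:
        return "GR aborted: neighbor does not act as helper"
    # The restart has to finish before the grace period runs out.
    if restart_takes_s > grace_period_s:
        return "GR aborted: grace period expired"
    return "GR ok: neighbor keeps the adjacency during the restart"


# Hypothetical scenario: GR enabled on the firewall only, not on the switch.
print(graceful_restart_outcome("graceful-restart", False, 30))
```

In both aborted cases the synced HA routes are the only thing keeping traffic flowing, so route-ttl still matters.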
Lucas_Piris

I have the same case with an HA FortiGate 600C.

Configuring the graceful-restart works perfectly.

 

But the big question is:

 

If I have an Active-Active HA with synchronized sessions, the routing table is synchronized to the slave, which you can check with the command:

get router info kernel

Why, when I forcibly power off the master, does OSPF need to renegotiate?

What is your understanding of this? Is it expected? I have a case open with TAC, but no satisfactory conclusion so far.

 

Regards

Lucas

 


Antonio_Milanese

Hello Lucas,

 


This is because only the master unit is actually running the OSPF process (and, by the way, there is only ONE OSPF process system-wide). The HA process simply injects OSPF route entries into the slaves' FIB (syncing every route-wait + route-hold) and marks those routes with a specific TTL (route-ttl). If you dig into the slaves with "get router info ospf database" you will see what I mean. Anyway, there is a decent explanation of OSPF + GR at http://kb.fortinet.com/kb....do?externalID=FD34881

Why does the OSPF process live only on the master, rather than being a "stateful mirrored" process? IMHO the developers realized that there is no one-size-fits-all solution, and that in some scenarios a stateful mirrored routing process/LSDB could backfire (stretched or transparent deployments, gratuitous ARP not viable, etc.). In those scenarios even OSPF GR could be a detriment, and you would want to adjust route-ttl, route-wait and route-hold to aggressively invalidate routes and establish new adjacencies ASAP.

Hope the explanation sounds satisfactory. Maybe one of the forum gurus can add to or correct the above speculations!

Best regards,
Antonio

Lucas_Piris

Hi Antonio,

 

Thank you for your explanation.

 

Best Regards

Lucas

 


Lucas_Piris

Hi,

 

This is a solution:

 

config router ospf
    set restart-mode graceful-restart
end

config system ha
    set route-ttl 60
    set route-wait 60
    set route-hold 60
end

 

:D

 

Now HA works as expected.
