
ADC HA Pair - what happens if the HA link is broken


Anand Gopinath


Dear all,

We have an ADC HA pair that spans two sites (NS1 in DC A, NS2 in DC B).

Failsafe mode is enabled only on the current primary appliance, NS1.

What will happen if the cross-site link breaks while both appliances stay online? Will there be a split brain, or will the primary appliance continue to function as normal?


If neither node can see its partner, both could take on the primary role, which is a split-brain scenario.

 

Configuring route monitors can help ensure that an ADC becomes primary only if it can't see its partner AND it can confirm the target network is reachable: https://docs.citrix.com/en-us/citrix-adc/current-release/system/high-availability-introduction/configuring-route-monitors-high-availability.html
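
To make the route-monitor point concrete, here is a toy Python model. It is not Citrix's actual election logic: `NodeView`, `claims_primary`, and `primaries_after_link_loss` are hypothetical names, and the rules only encode what is described above (a node that cannot see its partner takes over, unless a configured route monitor reports the target network as unreachable).

```python
from dataclasses import dataclass


@dataclass
class NodeView:
    """What one HA node can observe at a given moment (toy model)."""
    name: str
    is_primary: bool       # role held before the link broke
    sees_partner: bool     # heartbeats from the peer are arriving
    route_reachable: bool  # the monitored network is reachable
    route_monitor: bool    # whether a route monitor is configured


def claims_primary(node: NodeView) -> bool:
    """Return True if the node would act as primary under the toy rules."""
    if node.route_monitor and not node.route_reachable:
        # A failed route monitor keeps the node out of the primary role.
        return False
    if node.sees_partner:
        # Heartbeats are flowing, so the existing role assignment stands.
        return node.is_primary
    # Partner is silent: the node assumes the partner is gone and takes over.
    return True


def primaries_after_link_loss(route_monitor: bool, ns2_route_reachable: bool):
    """Both nodes stay online but lose sight of each other."""
    ns1 = NodeView("NS1", True, False, True, route_monitor)
    ns2 = NodeView("NS2", False, False, ns2_route_reachable, route_monitor)
    return [n.name for n in (ns1, ns2) if claims_primary(n)]


# No route monitors: both nodes claim primary (split brain).
print(primaries_after_link_loss(route_monitor=False, ns2_route_reachable=False))
# ['NS1', 'NS2']

# Route monitors on, NS2 cut off from the monitored network: only NS1 is primary.
print(primaries_after_link_loss(route_monitor=True, ns2_route_reachable=False))
# ['NS1']

# Route monitors on, but both sides still reach the network: split brain again.
print(primaries_after_link_loss(route_monitor=True, ns2_route_reachable=True))
# ['NS1', 'NS2']
```

Note the last case: in this model, if both sites can still reach the monitored network, route monitors alone do not stop both nodes from going primary.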

 

If you are crossing datacenters, note that HA pairs are intended only for low-latency connections, not WAN links. Also, you likely need an INC-mode (Independent Network Configuration) setup if the networks/routes differ between the sites.

 

Quick added notes:

Failsafe mode only helps mitigate dual-secondary scenarios (where there is no primary) by forcing an unhealthy primary to keep trying until a healthy alternative is present. It doesn't prevent split brain/dual primaries.

Stay Primary might help you keep one specific site as the preferred primary, so that the role stays on that side while both nodes are up and fails over only when that node is down. It also won't prevent split-brain scenarios if the nodes lose connectivity with each other.

 


In reply to Rhonda Rowland (3/16/2022, 7:12 PM):


Thank you for the quick help, Rowland. Much appreciated. My query is answered.

Just one clarification regarding failsafe mode: should we enable it on both HA nodes, or just on the primary as we have done?


 

I posted about it here too, but in general it's better to set it on both (though we're usually talking about HA pairs in the same location, where split-brain risks are much lower).

https://discussions.citrix.com/topic/407710-about-fail-safe-mode-in-an-ha-configuration/

And Jonathan Clark has some notes here:  https://discussions.citrix.com/topic/400283-netscaler-failsafe-mode/

 

Usually: if failsafe is set and every node fails at the same time, the one that was up longest/last retains the primary role until a healthier ADC is available. If you set it on only one node and that node goes offline, the remaining node has no failsafe protection if something else then happens to it.

 

So, depending on how the first node (failsafe on) fails: if it is offline and only the secondary (failsafe off) is available, and that node then has its own issue, you still risk having no primary.

If you have a reasonable belief that you've mitigated the dual primary/split brain issue, then I would likely keep failsafe enabled on both nodes.
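
As a rough illustration of the reasoning above, here is a toy Python model of the "failsafe on one node only" risk. The rules are my simplification of the behaviour described in this post, not the ADC's real election algorithm, and `Node` and `elect_primary` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    name: str
    online: bool    # the appliance is powered on and running
    healthy: bool   # the appliance passes its own HA health checks
    failsafe: bool  # failsafe mode is enabled on this node


def elect_primary(last_primary: Node, other: Node) -> Optional[str]:
    """Return the name of the node acting as primary, or None (toy rules)."""
    # Offline nodes cannot serve; the last primary is considered first.
    candidates = [n for n in (last_primary, other) if n.online]
    # Rule 1: a healthy online node takes the role.
    for node in candidates:
        if node.healthy:
            return node.name
    # Rule 2: nobody is healthy, so a failsafe node keeps the role
    # rather than leaving the pair with no primary at all.
    for node in candidates:
        if node.failsafe:
            return node.name
    return None


# Failsafe only on NS1: NS1 goes offline, then NS2 develops its own issue.
ns1 = Node("NS1", online=False, healthy=False, failsafe=True)
ns2 = Node("NS2", online=True, healthy=False, failsafe=False)
print(elect_primary(last_primary=ns2, other=ns1))  # None

# Same failure sequence with failsafe on both nodes: NS2 keeps serving.
ns2.failsafe = True
print(elect_primary(last_primary=ns2, other=ns1))  # NS2
```

In this sketch the only difference between the two runs is the failsafe setting on NS2, which is why setting it on both nodes is usually the safer choice.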

 


In reply to Rhonda Rowland (3/18/2022, 2:46 PM):


Thank you, Rowland, for the quick response.

Fully clear now.

Thanks a lot.

