
HA replication failure, session policy - Workaround: warm reboot - New versions July August


Pedro Huaroto M


Good morning NetScaler team. We have two NetScalers in HA on version 12.1-63.24 with an SSL VPN configuration. The session policy configuration works correctly, and it also works normally after a failover (each user gets their respective session policy).

 

However, after upgrading both NetScalers, a failover leaves the user without their session policy: the user authenticates, but the session policy is not applied, so the user either gets the default session or sees an error.

 

On the NetScaler that became primary, we verified that the configuration replicated from the other NetScaler is present in both the saved and running configuration.


The problem is fixed when the primary NetScaler is warm rebooted (after first setting the other appliance to stay secondary).

In the laboratory, this behavior was observed in the versions:
build-12.1-65.17
build-13.0-87.9
build-13.1-27.59

 

Could you please share any suggestions, or let us know if you have seen the same behavior?
Thank you


12.1.63 was a "buggy" 12.1 build from a GUI standpoint. I don't recall if there were other issues. But I don't think that alone is a factor yet.

Going from 12.1 to 13.1 is problematic: you're likely dependent on classic engine policies, and you're moving to a build that may have classic deprecations, so some features may not be present after the upgrade. You can confirm whether settings are being lost by comparing the running configs on the primary and secondary and looking for gaps.
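
One way to do that comparison for the Gateway pieces is to filter the running config on each node and compare the output side by side. This is only a sketch: the filter strings are my guess at what matters for a session policy problem, so widen them as needed. Lines starting with # are notes, not commands.

```
# Run the same commands on BOTH the primary and the secondary
show ns runningConfig | grep -i sessionAction
show ns runningConfig | grep -i sessionPolicy
# Assumption: the policy is bound at the AAA user/group level
show ns runningConfig | grep -i aaa
```

Any line that appears on one node but not the other is a gap worth looking at before blaming the failover itself.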

Check syslog on both appliances for HA errors.

 

Otherwise, I don't understand what you are describing.

If your 2 ADCs are on the same version and in an ha pair, do you have the same gateway config and session policy on both ADCs, PRIOR to testing an upgrade?

Are you testing the failover while they are on same or different builds? (Just to clarify scenario.)

Are you saying that if a user connects to the NetScaler A when it is primary as a gateway user it works; but after failover it doesn't? 
(Or otherwise clarify what you mean by not loading a session policy.) First, confirm that the policy is present on both appliances. After failover, is a new logon required, or does the logon carry over? Are you using any complex expressions, either classic or advanced, that may be affected by the upgrade to a new build?

 

Next, possible troubleshooting to clarify the behavior:

Licensing checks in the HA Pair:

  • Are both ADCs licensed with individual appliance platform licenses tied to the appropriate licensing MAC address, so that both systems are licensed for the Gateway feature?
  • Are you using an ADC Standard/Adv/Premium license or a Gateway-only license?
  • Are you doing a full VPN connection or a gateway in ICA Proxy mode? If VPN, do you also have enough "gateway user" licenses (universal licenses) assigned to the hostname on BOTH appliances, with the hostname set the same on BOTH appliances?
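
To gather the licensing facts for the questions above, something like the following can be run on BOTH appliances and the output compared. Treat it as a sketch: the exact fields shown vary by build and license type.

```
# Run on each node, before and after a failover
show ns license
show ns hostName
show ns feature
```

The things I'd compare are the SSL VPN entry and the gateway user counts in the license output, and whether the hostname is identical on both nodes.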

If you have issues after failover with the existing session, try the following:

  • Failover the HA pair from NSA to NSB.
  • Clear the user's cookies and try a new logon to NSB (the new primary). Does it work, or does it still have issues? If a new logon also fails, that points to a config difference, firewall issue, or licensing issue between the appliances. If new sessions work on the new primary and only existing sessions fail to resume after failover, that narrows down what needs to be investigated.
  • Do the appliances have the same clock/timezone settings, AND are the global HTTP parameters set to v1 cookies (instead of v0)?
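
From the CLI, the failover test above might look roughly like this. It is a sketch under my own assumptions about ordering; in particular, where the cookie version shows up in the output is something to verify on your build. Lines starting with # are notes, not commands.

```
# On NSA (current primary): trigger the failover
force ha failover
# On NSB: confirm it is now primary
show ha node
# On NSB, after the test user logs on again: is a session listed?
show aaa session
# On both nodes: global parameters (assumption: the cookie version
# is listed here; the matching setting is -cookieversion on "set ns param")
show ns param
```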

Start by confirming that when both appliances are on the same, original build that connections work and that with an ha failover you see the behavior you expect without issues.

Then we can look at the "mixed mode" upgrade issue.

But remember, when you're on different builds, propagation and synchronization are suspended.
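
A quick way to check this on each node is shown below. It is a minimal sketch; the wording of the status lines differs a little between builds.

```
# Run on both nodes and compare
show ns version
show ha node
```

If the builds match, the show ha node output should report synchronization and propagation as active again; if they don't match, expect both to be reported as suspended or disabled.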

So, again does a new session work on new primary and the only issue is the existing sessions or other problem? Does the issue resolve once both appliances are on the same build?

 


Hi, Thanks Rhonda,

 

My detailed answers are below:

"... If your 2 ADCs are on the same version and in an ha pair, do you have the same gateway config and session policy on both ADCs, PRIOR to testing an upgrade?..."

Response:
Yes, both ADCs, in the initial scenario, (12.1-63.24) have the same configuration and work well when failover between the ADCs is performed (the session policy is applied according to the configuration).


"... Are you testing the failover while they are on same or different builds? (Just to clarify scenario.) ..."
Response:
Failover is performed with the same version on both ADCs.

"... Are you saying that if a user connects to the NetScaler A when it is primary as a gateway user it works; but after failover it doesn't? ..."

Answer: Yes, that's right. With the higher versions, when the secondary ADC becomes primary it does not apply the session policy, and when the other ADC is made primary again it also stops working. The workaround is to restart the primary ADC with the warm reboot option (with the other ADC set to stay secondary so that it does not become primary).

 

We verified that the session policy is present on both appliances (saved and running configuration).
In production the session policy has several options, but in the lab we created a simple policy and the same thing happens: it is not applied after a failover.
(Lab Setup)
add vpn sessionAction 2633 -clientlessVpnMode ON
add vpn sessionPolicy pol_2633 true 2633
bind aaa user 1633 -policy pol_2633 -priority 100
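
For anyone trying to reproduce this, the lab objects above can be inspected on the new primary right after a failover with the matching show commands. This is a sketch using the names from the lab setup; what the output looks like in the failing state is not something I can confirm. Lines starting with # are notes, not commands.

```
# On the new primary, after the failover and a test logon
show vpn sessionAction 2633
show vpn sessionPolicy pol_2633
show aaa user 1633
```

If the policy and binding are listed but the policy is not being hit after the logon, that would match the symptom described here (configuration present, policy not applied).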

 

On the ".. Licensing checks in the HA Pair..." comment:
Response:
- The ADCs have an Advanced license, and the functionality was verified on version 12.1-63.24. On the other versions the license is also shown as enabled and in green, with the OK synchronization message.


- In production the configuration is a VPN tunnel with the plugin; in the lab it was reduced to clientless only, with the same result.

- The problem is solved only when the primary ADC is restarted with the warm reboot option (with the other ADC set to stay secondary so that it does not become primary).

- The ADCs have the same time and time zone.

 

Failover only works fine with version 12.1-63.24 and fails with build-12.1-65.17, build-13.0-87.9 and build-13.1-27.59.

 

With build-12.1-65.17, build-13.0-87.9 and build-13.1-27.59, with both ADCs on the same version, when the failover is performed and the session policy stops working, the workaround is a warm reboot.
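
For reference, the workaround described above corresponds roughly to the following CLI sequence. This is a sketch of the steps as I understand them; the last step (returning the secondary to normal HA participation) is my assumption and is not stated in the description. Lines starting with # are notes, not commands.

```
# On the SECONDARY: pin it so it cannot take over during the reboot
set ha node -haStatus STAYSECONDARY
# On the PRIMARY: warm reboot
reboot -warm
# On the PRIMARY, once it is back: confirm it is still primary
show ha node
# On the SECONDARY (assumption): re-enable normal HA participation
set ha node -haStatus ENABLED
```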

 

Thank you


OK - that clarifies the behavior.  I can also see that your session policy uses the advanced engine and you're changing versions.

A few more things to check.

1) What do your VPN user and ICA Proxy user license counts show on BOTH appliances: show ns license (especially before or after failover)? This will show us if you are missing universal licenses after failover, which would affect your VPN/clientless license acquisition, as you are not using the ICA Proxy licenses in this scenario.

2) What does your license show for AAA (or authentication)?  (Also, does this change before or after failover?)

3) Within your session profile, are you doing any client security/EPA scans on the security tab that may be based on the classic engine (and so may be impacted in some cases)?

 

This may require a call to support. It's possible that your license file is missing something needed to make gateway work on newer builds if it was issued under older licensing models.

Otherwise, I'm looking to see if there is something affecting your user count licenses after upgrade (with the above questions).

 

We'll see if anyone else has better ideas.

 


Thank you,

 

Item 1, answer: We verified that both NetScalers have the same licenses. I don't think it is a license issue, because after a warm reboot it works. Even in the lab, with a single session policy and a single user, the problem reproduces about 90% of the time.

 

Item 2, answer: Authentication is successful, even after switching, but the session policy is not loaded until a warm reboot is performed.

 

Item 3, answer: EPA is not used.

 

Support is still analyzing. The strange things are that the workaround is a warm reboot, and that the problem reproduces in the lab on the indicated versions with a minimal configuration. Thanks.

