Jonathan Hoppe, posted November 6, 2014

We monitor hundreds of websites with HTTP monitors that have the secure option enabled on port 443, essentially checking each website over HTTPS and looking for a 200 response code. Pretty typical stuff. Several of these sites have the monitor fail consistently, and when we look at the servicegroup to see why, the monitor says "Last response: failure - Time out during SSL handshake stage". This is cropping up more and more, and we can't figure out why. We've checked the usual suspects: we made sure the response time-out wasn't too low, looked for SSL certificate errors, and tried "GET /" instead of "HEAD /", all without success.

To dig deeper, I SSH'd into the Netscaler, dropped to the shell, and ran curl -IL -k "https://www.example.com" (example.com is not the real website, of course) in an effort to mimic what the Netscaler's monitor might be doing. I got back this:

curl: (35) Unknown SSL protocol error in connection to www.example.com:443

Okay, this is somewhat helpful, although not entirely. But for other sites that give the same "Time out during SSL handshake stage", curl is no help because it shows "HTTP/1.1 200 OK" in the response, which makes it even more perplexing that the monitor is down.

I'm posting today in the hope that someone has encountered a similar issue and may have some ideas or troubleshooting strategies so we can get to the bottom of this. For now we've had to use simple PING monitors for these sites instead, but that is far from accurate, since PING might work when Apache/IIS is down, as we all know. Thanks!
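The curl check above lumps the TLS handshake and the HTTP exchange together. A minimal Python sketch that reports which stage failed could look like the following. The names `probe` and `classify` and the 2-second timeout are assumptions made for illustration, and Python's TLS stack is not the Netscaler's, so (as with curl) a success here does not prove the monitor's own handshake will succeed.

```python
import socket
import ssl


def classify(stage, status=None):
    """Map a probe outcome to a monitor-style verdict string."""
    if stage == "connect":
        return "DOWN: TCP connect failed"
    if stage == "handshake":
        return "DOWN: failure during SSL handshake stage"
    if stage == "http":
        return "DOWN: no usable HTTP response"
    return "UP" if status == 200 else f"DOWN: HTTP {status}"


def probe(host, port=443, timeout=2.0, method="HEAD", path="/"):
    """Probe host over HTTPS and name the stage that failed, if any."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False       # like curl -k: skip certificate validation
    ctx.verify_mode = ssl.CERT_NONE
    try:
        raw = socket.create_connection((host, port), timeout=timeout)
    except OSError:                  # refused, unreachable, or timed out
        return classify("connect")
    try:
        tls = ctx.wrap_socket(raw, server_hostname=host)
    except OSError:                  # ssl.SSLError and timeouts are OSError subclasses
        raw.close()
        return classify("handshake")
    try:
        with tls:
            request = f"{method} {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
            tls.sendall(request.encode("ascii"))
            status_line = tls.makefile("rb").readline().decode("latin-1")
        status = int(status_line.split()[1])
    except (OSError, IndexError, ValueError):
        return classify("http")
    return classify("ok", status)
```

Calling `probe("www.example.com")` returns a single verdict string, so a site that fails in the handshake can be told apart from one that completes the handshake and then returns a non-200 status.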
Paul Blitz, posted November 7, 2014

Why not run a TCP trace on the Netscaler and see what the actual monitors are doing, rather than trying to mimic them? Not only will you be able to see the packet details, you'll also see the timings. If you load your SSL key into Wireshark, it will (if recent) also decrypt the SSL packets.

Also worth remembering that HTTP 1.1 says you should have a Host header, which the default monitors don't actually send...
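To make the Host header point concrete, here is a small Python sketch of the raw request bytes with and without the header. The function name `build_monitor_request` is hypothetical and this is not the Netscaler's own request builder; it only shows what differs on the wire.

```python
def build_monitor_request(method="HEAD", path="/", host=None):
    """Return the raw HTTP/1.1 request bytes, adding Host only when given."""
    lines = [f"{method} {path} HTTP/1.1"]
    if host:
        lines.append(f"Host: {host}")
    lines.append("Connection: close")
    # An empty line terminates the header section.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


if __name__ == "__main__":
    print(build_monitor_request())                         # no Host header
    print(build_monitor_request(host="www.example.com"))   # with Host header
```

A back end that serves several sites from one address may answer the first form with an error or a default site, so comparing the two requests against the same server is a quick way to see whether the missing header matters.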
John Smith1709152147, posted November 7, 2014

We are having the same problem. We have found that the monitors work fine with SHA-1 certificates but fail with SHA256 certificates. Are you using SHA256 certificates? We see the issue when attempting to monitor two Web Interface servers running on Windows Server 2008 R2. We are using VPX 3000s on 10.5. Did you find a solution? We have been working with Citrix support, thus far to no avail. I am on the line with them now and just found your post.
Jonathan Hoppe (Author), posted November 7, 2014

John, I get the error with SHA1 certs too, so that doesn't seem to be the cause for me. I also have successful HTTP monitoring with both types of certs, so it is something specific to these sites. It could be an IPS, a firewall or something else on the other end, for all I know. I'm working on setting up a test VPX device so I can run a trace as Paul suggests above. My production Netscaler appliance is so busy that running a trace for 1 minute produces gigabytes of data. I'll post any results.
Paul Blitz, posted November 11, 2014

Re the live box: can't you use a filter on the trace to pre-limit the data you capture? But yeah, a lab VPX would work well (although do remember that VPX doesn't do TLS 1.1 or 1.2, which could be part of the issue!)

I'm assuming that you have "normal" timings for the monitor, so it's not a REAL timeout issue?
Sanjith Abraham1709153204, posted November 11, 2014

Possible reasons:
1) If there is a firewall between the Netscaler and these servers that is patched with a "Poodle sslv3 block", it's possible that the packets are dropped at the firewall when the Netscaler uses SSLv3 for the SSL handshake. Better to disable SSLv3 on the services, forcing the service monitors onto TLSv1.
2) The backend servers are short on resources and are rejecting some SSL connections.
3) The backend servers have multiple interfaces, and some return traffic is not routed back to the Netscaler because it leaves through a different interface and loops in your network.
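One way to test the protocol-version theory in the first item from a separate machine is to offer the back end exactly one TLS version at a time and see which handshakes complete. The Python sketch below does that; the names `pinned_context` and `accepted_versions` are assumptions made for illustration. Most current OpenSSL builds no longer include SSLv3, so this sketch cannot reproduce an SSLv3 handshake; it only shows which TLS versions the path to the server accepts.

```python
import socket
import ssl


def pinned_context(version):
    """Client context that offers exactly one TLS protocol version."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.check_hostname = False       # must be disabled before CERT_NONE
    ctx.verify_mode = ssl.CERT_NONE  # only the handshake matters here
    ctx.minimum_version = version
    ctx.maximum_version = version
    return ctx


def accepted_versions(host, port=443, timeout=3.0,
                      versions=(ssl.TLSVersion.TLSv1_2, ssl.TLSVersion.TLSv1_3)):
    """Try one handshake per version; report the negotiated version or the failure."""
    results = {}
    for version in versions:
        try:
            with socket.create_connection((host, port), timeout=timeout) as raw:
                with pinned_context(version).wrap_socket(raw, server_hostname=host) as tls:
                    results[version.name] = tls.version()
        except OSError as exc:       # ssl.SSLError and timeouts are OSError subclasses
            results[version.name] = f"failed: {exc.__class__.__name__}"
    return results
```

Running `accepted_versions("www.example.com")` from a host on the same side of the firewall as the Netscaler gives a per-version result, which helps separate a version mismatch from the resource and routing causes in the other two items.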
Jonathan Hoppe (Author), posted February 11, 2015

I can't believe that 3 months have gone by, but I finally had a couple of hours to spare today, so I ran a trace and captured what the monitor was doing. This was the result:

TLSv1 Record Layer: Alert (Level: Fatal, Description: Unsupported Certificate)
    Content Type: Alert (21)

So I dug further to find the difference between this monitored device and the others, and found that this device has a certificate with a 4096-bit key. I did some more testing and indeed, 4096-bit keys are not supported. 2048, no problem. If I have extra time in the next week, I'll try a 3072 to see how that goes.

So the next question to the folks at Citrix is WHY!!! This will become a huge problem in a very short amount of time. Hopefully support is on the roadmap.
Jonathan Hoppe (Author), posted February 12, 2015

For what it is worth, I spun up an older Netscaler VPX with 9.3 - 54.4.nc and it is happy to check SSL certs with a 4096-bit key. I then upgraded that same Netscaler to 9.3 - 68.3.nc and it fails. So it seems Citrix downgraded this functionality at some point. :(
Paul Blitz, posted February 12, 2015

http://support.citrix.com/proddocs/topic/ns-faq-map/ns-faq-ssl-ref.html

On an MPX, 4096-bit keys are supported on back-end servers, but on a VPX, only 2048-bit keys are supported on back-end servers.
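Given that limit, it can help to check back-end certificates for key size before pointing a VPX monitor at them. The Python sketch below reads the text dump of a certificate (for example from `openssl x509 -noout -text -in cert.pem`) and flags keys above 2048 bits. The helper names are hypothetical, the 2048-bit limit is taken from the posts above rather than verified for any particular firmware, and the "Public-Key: (N bit)" line format is an assumption about OpenSSL's text output.

```python
import re

# Assumption: back-end key size limit for VPX as reported in this thread.
VPX_BACKEND_MAX_BITS = 2048

# Matches e.g. "RSA Public-Key: (4096 bit)" or "Public-Key: (2048 bit)".
_KEY_SIZE_RE = re.compile(r"Public-Key: \((\d+) bit\)")


def key_bits(x509_text):
    """Return the public key size found in a certificate text dump, or None."""
    match = _KEY_SIZE_RE.search(x509_text)
    return int(match.group(1)) if match else None


def exceeds_limit(x509_text, limit=VPX_BACKEND_MAX_BITS):
    """True when the dump shows a key larger than the given limit."""
    bits = key_bits(x509_text)
    return bits is not None and bits > limit
```

Feeding each back end's certificate dump through `exceeds_limit` gives a quick list of servers whose monitors are likely to fail for the reason found in the trace, without waiting for the handshake alert to show up.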
Sanjay Kumar1709157223, posted August 2, 2016

Quote: "Possible reasons: 1) If there is a firewall between the Netscaler and these servers that is patched with a "Poodle sslv3 block", it's possible that the packets are dropped at the firewall when the Netscaler uses SSLv3 for the SSL handshake. Better to disable SSLv3 on the services, forcing the service monitors onto TLSv1. 2) The backend servers are short on resources and are rejecting some SSL connections. 3) The backend servers have multiple interfaces, and some return traffic is not routed back to the Netscaler because it leaves through a different interface and loops in your network."

The 1st option fixed my issue, thanks.
Oscar Moyano Gomariz, posted March 25, 2019

Same problem: an HTTPS service to Exchange, and its status is down. I tried everything: using TLS 1.2, disabling the firewall, and several different certificates. The error is still:

Failure - Time out during SSL handshake stage

I have Netscaler NS12.0.56.20.nc and Exchange 2016.
Patrick Karall1709152689, posted June 26, 2019

Hi, I have ADC 13.0 36.27 and it also seems that 4096-bit certificates are not supported on backend servers. I got the same error. Then I changed to a 2048-bit certificate and everything is OK. Is there still a known issue with 4K certificates?

br, Patrick
This topic is now archived and is closed to further replies.