NetScaler VPX monitor error – Timeout during SSL handshake stage


I came across this by accident while setting up NetScaler GSLB for a Citrix solution for one of my customers. The service groups in NetScaler were giving an error message for the monitoring probes – https and CITRIX-XD-DDC (both secure). The NetScaler was VPX running 11.1.63.9nc firmware.

Last response: failure – Time out during SSL handshake stage

How do you troubleshoot such issues? There are a couple of options.

Wireshark can be a good tool or you could use the built-in nstrace utility in NetScalers. Additionally, you will also need to make sure that the required ports are open between your NetScaler and the backend servers/services.

If you run a trace and look at it, you can see the below

TLSv1 Record Layer: Alert (Level: Fatal, Description: Unsupported Certificate)
Content Type: Alert (21)

The certificate that was installed on the DDCs and Storefront servers were created with a key size of 4096 and that was the issue.

Fix was to generate a fresh certificate with 2048 key size and the issue will be gone. Also, note that this issue is only prevalent in VPX versions of NetScaler. NetScaler MPXs will NOT exhibit this issue due to their built-in SSL chips.

Hope, this helps somebody out there.

How to verify that NetScaler COOKIEINSERT persistent sessions are working?


Ever had issues with your Storefront load balancing setup not working after painstakingly setting up NetScaler load balancing?

Now, let’s discuss a few reasons why cookie based persistence are better (in some scenarios) compared to Source-IP based persistence.

In some circumstances, using persistence based on source IP address can overload your servers. All requests to a single Web site or application are routed through the single gateway to the NetScaler appliance, even though they are then redirected to multiple locations. In multiple proxy environments, client requests frequently have different source IP addresses even when they are sent from the same client, resulting in rapid multiplication of persistence sessions where a single session should be created. This issue is called the “Mega Proxy problem.” You can use HTTP cookie-based persistence instead of Source IP-based persistence to prevent this from happening.

If all incoming traffic comes from behind a Network Address Translation (NAT) device or proxy, the traffic appears to the NetScaler appliance to come from a single source IP address. This prevents Source IP persistence from functioning properly. Where this is the case, you must select a different persistence type.

HTTP Cookie Persistence

When HTTP cookie persistence is configured, the NetScaler appliance sets a cookie in the HTTP headers of the initial client request. The cookie contains the IP address and port of the service selected by the load balancing algorithm. As with any HTTP connection, the client then includes that cookie with any subsequent requests.

When the NetScaler appliance detects the cookie, it forwards the request to the service IP and port in the cookie, maintaining persistence for the connection. You can use this type of persistence with virtual servers of type HTTP or HTTPS. This persistence type does not consume any appliance resources and therefore can accommodate an unlimited number of persistent clients.

Note: If the client’s Web browser is configured to refuse cookies, HTTP cookie-based persistence will not work. It might be advisable to configure a cookie check on the Web site, and warn clients that do not appear to be storing cookies properly that they will need to enable cookies for the Web site if they want to use it.

By default, the NetScaler appliance sets HTTP version 0 cookies for maximum compatibility with client browsers. (Only certain HTTP proxies understand version 1 cookies; most commonly used browsers do not.) You can configure the appliance to set HTTP version 1 cookies, for compliance with RFC2109. For HTTP version 0 cookies, the appliance inserts the cookie expiration date and time as an absolute Coordinated Universal Time (GMT). It calculates this value as the sum of the current GMT time on the appliance and the time-out value. For HTTP version 1 cookies, the appliance inserts a relative expiration time by setting the “Max-Age” attribute of the HTTP cookie. In this case, the client’s browser calculates the actual expiration time.

So some of the benefits of using cookie based persistence are

  • Easy to configure
  • Load balancer does all the work without involving the back end services (mostly a web server)
  • No connection table to maintain
  • Activity based – Time out is based on idle connection and not the total connection time
  • caters to most of the load balancing scenarios
  • if proxies are involved in the setup where source IPs are masked, this makes it an easy choice

Cons of Cookie based persistence

  • can throw off the load balancing where timeouts are set to 0 or when time outs are long lived
  • when Type 0 cookie is used, time of the cookie value is absolute(GMT time). When type 1 is used, relative time is used (local time)
  • hard to troubleshoot

Testing COOKIEINSERT persistence

Use Putty or some other SSH clients to login to the NetScaler.

sh lb vserver vservername

If you take a closer look at the snippet above, the letters in yellow shows that COOKIEINSERT persistence is turned ON with a time out value of 0 mins. This means that the cookies doesn’t have an expiry time set by the NetScaler appliance. The backup persistence is set to SourceIP with a time out value of 60mins. You can also see the cookie values at the bottom of the results staring with NSC_xxxxxxx

You will now need to get into Shell with an account that has permissions. To get to shell, type shell and hit enter

Now type the following

nsconmsg -i vservername -s ConLb=1 -d oldconmsg

In the snippet above, the highlighted text in yellow shows that persistence is set to Cookieinsert and the number of hits for each service. You will also get values such as response times, CPU and memory utilization.

VIP value shows the total number of hits to the vServer hits
Each service has a line starting with S (IP Address) and its status.
Hits (x, x/sec) shows the initial (method) hits to the service for a new connection.
P (x, x/sec) shows hits to the service that are served using a persistence cookie

Also, check the current persistent session via the commands below

sh persistentsessions
sh persistentsessions -summary

NetScaler Commands – Obtain Running Processes ( NetScaler’s Task Manager)


Here is how to get to the task manager equivalent tool on NetScalers. This is quite useful to check who is running what and how much resources are consumed.

Use your favorite SSH tool and connect to the management IP of the NetScaler

>shell top

The output should look like the below
Capture19

Now, you must be wondering what is that process in the snippet that takes all the CPU on NetScaler. NSPPE is the NetScaler Processing Engine of NetScaler. Some refer it to the NetScaler’s operating system. The above behaviour is absolutely normal. Don’t press the panic button just yet 🙂

NetScaler Commands – Obtain Event Logs


There are many a times you may want to look at the NetScaler event logs and the below command should let you do just that.

As always, use your favorite SSH tool to connect to NetScaler and run the following commands one after the other.

 

#shell 
#/netscaler/nsconmsg -K /var/nslog/newnslog -d event | more

Please be careful to use capital K (this is for reading the logs and a LOWER case “k” is for writing to the NetScaler event files). Keep hitting Enter key to scroll through all the event logs

NetScaler Commands – Obtain Performance Statistics


There are a number of ways to obtain performance stats from NetScaler and one of the most common methods is outlined below

Use PuTTY to connect to the management IP of the NetScaler and run the below commands

#shell 
#nsconmsg -s ConLb=2 -d oldconmsg

Capture3

Please note that ConLb=2 is case sensitive.

-s ConLb=2  is a detail level for load balancing statistics

-d oldconmsg  is display statistical information

You can press CTRL +C to end the live statistics screen.

Another couple of commands that you can use is

>stat ns 
>stat system detail

>stat system detail

The below is how the stat ns output would look like
Capture21

The below is how the stat system detail output would look like

Capture22