How to verify that NetScaler COOKIEINSERT persistent sessions are working?


Ever had issues with your Storefront load balancing setup not working after painstakingly setting up NetScaler load balancing?

Now, let’s discuss a few reasons why cookie based persistence are better (in some scenarios) compared to Source-IP based persistence.

In some circumstances, using persistence based on source IP address can overload your servers. All requests to a single Web site or application are routed through the single gateway to the NetScaler appliance, even though they are then redirected to multiple locations. In multiple proxy environments, client requests frequently have different source IP addresses even when they are sent from the same client, resulting in rapid multiplication of persistence sessions where a single session should be created. This issue is called the “Mega Proxy problem.” You can use HTTP cookie-based persistence instead of Source IP-based persistence to prevent this from happening.

If all incoming traffic comes from behind a Network Address Translation (NAT) device or proxy, the traffic appears to the NetScaler appliance to come from a single source IP address. This prevents Source IP persistence from functioning properly. Where this is the case, you must select a different persistence type.

HTTP Cookie Persistence

When HTTP cookie persistence is configured, the NetScaler appliance sets a cookie in the HTTP headers of the initial client request. The cookie contains the IP address and port of the service selected by the load balancing algorithm. As with any HTTP connection, the client then includes that cookie with any subsequent requests.

When the NetScaler appliance detects the cookie, it forwards the request to the service IP and port in the cookie, maintaining persistence for the connection. You can use this type of persistence with virtual servers of type HTTP or HTTPS. This persistence type does not consume any appliance resources and therefore can accommodate an unlimited number of persistent clients.

Note: If the client’s Web browser is configured to refuse cookies, HTTP cookie-based persistence will not work. It might be advisable to configure a cookie check on the Web site, and warn clients that do not appear to be storing cookies properly that they will need to enable cookies for the Web site if they want to use it.

By default, the NetScaler appliance sets HTTP version 0 cookies for maximum compatibility with client browsers. (Only certain HTTP proxies understand version 1 cookies; most commonly used browsers do not.) You can configure the appliance to set HTTP version 1 cookies, for compliance with RFC2109. For HTTP version 0 cookies, the appliance inserts the cookie expiration date and time as an absolute Coordinated Universal Time (GMT). It calculates this value as the sum of the current GMT time on the appliance and the time-out value. For HTTP version 1 cookies, the appliance inserts a relative expiration time by setting the “Max-Age” attribute of the HTTP cookie. In this case, the client’s browser calculates the actual expiration time.

So some of the benefits of using cookie based persistence are

  • Easy to configure
  • Load balancer does all the work without involving the back end services (mostly a web server)
  • No connection table to maintain
  • Activity based – Time out is based on idle connection and not the total connection time
  • caters to most of the load balancing scenarios
  • if proxies are involved in the setup where source IPs are masked, this makes it an easy choice

Cons of Cookie based persistence

  • can throw off the load balancing where timeouts are set to 0 or when time outs are long lived
  • when Type 0 cookie is used, time of the cookie value is absolute(GMT time). When type 1 is used, relative time is used (local time)
  • hard to troubleshoot

Testing COOKIEINSERT persistence

Use Putty or some other SSH clients to login to the NetScaler.

sh lb vserver vservername

If you take a closer look at the snippet above, the letters in yellow shows that COOKIEINSERT persistence is turned ON with a time out value of 0 mins. This means that the cookies doesn’t have an expiry time set by the NetScaler appliance. The backup persistence is set to SourceIP with a time out value of 60mins. You can also see the cookie values at the bottom of the results staring with NSC_xxxxxxx

You will now need to get into Shell with an account that has permissions. To get to shell, type shell and hit enter

Now type the following

nsconmsg -i vservername -s ConLb=1 -d oldconmsg

In the snippet above, the highlighted text in yellow shows that persistence is set to Cookieinsert and the number of hits for each service. You will also get values such as response times, CPU and memory utilization.

VIP value shows the total number of hits to the vServer hits
Each service has a line starting with S (IP Address) and its status.
Hits (x, x/sec) shows the initial (method) hits to the service for a new connection.
P (x, x/sec) shows hits to the service that are served using a persistence cookie

Also, check the current persistent session via the commands below

sh persistentsessions
sh persistentsessions -summary

Advertisements

Don’t let your user-experience be a “Spectre” of itself after “Meltdown”


Bust your ghosts not your user experience

The names Spectre and Meltdown invoke feelings of dread in even the most seasoned IT engineer.  To those uninitiated, let me get you up-to-speed quickly.

Spectre is a vulnerability that takes advantage of “Intel Privilege Escalation and Speculative Execution”, and exposes user memory of an application to another malicious application.  This can expose data such as passwords.

Meltdown is a vulnerability that takes advantage of “Branch prediction and Speculative Execution”, and exposes kernel memory.  A compromised server or client OS running virtualized could gain access to kernel memory of the host exposing all guest data.

Both vulnerabilities take advantage of a 20-year-old method of increasing processor performance.

Server_Protection

As a result, code will need to be updated to address these vulnerabilities at OS and OEM-manufacturer levels, at the expense of system performance.

On their part, Microsoft reluctantly admits that performance will suffer.  “Windows Server on any silicon, especially in any IO-intensive application, shows a more significant performance impact when you enable the mitigations to isolate untrusted code within a Windows Server instance,” wrote Terry Myerson, Executive Vice President for the Windows and Devices group.

According to Geek Wire, these two vulnerabilities which take advantage of a 20-year-old design flaw in modern processors can be “mitigated;” the word we’re apparently using to describe this new world in 2018, in which servers lose roughly 10 to 20% performance for several common workloads.

This affects not only workloads executed against local, on-site resources but even those utilizing services, such as AWS, Google Public Cloud or Azure.

cpu_utilReader submission @ The Register showing CPU before / after patches

We’ve heard from some of our insiders who use Login VSI to validate system performance that they’re seeing a reduction of 5% in user-density after performing Microsoft recommendations. Knowing that the vulnerability wasn’t solved by OS updates alone we, at Login VSI, wanted the ability to test the impending hardware vendor firmware / BIOS changes.

Now is the time to capture your baseline performance

How do you know how much of an impact the fixes for Spectre and Meltdown will be if you don’t have anything to compare it to? Keep in mind that these patches will need to be installed on a number of systems in your solution including server hardware, operating systems, storage subsystems and so on.

Many of our customers perform tests where they compare a known good solution, or a baseline, with changes that have been made. This gives them the ability to accurately assess the performance impact of that change, which in turn allows them to compensate with more hardware, or further tuning of the applications and OS. The patented methods used by Login VSI provide a quantifiable result for determining the impact of a change in virtual desktop and published application environments.

Using Login VSI

If you wish to test the changes before pushing them into your production environment, then use Login VSI to put a load, representative of your production users, on the system. This will objectively show how much more CPU will be used as a result of the Spectre or Meltdown patches. It is expected that the end users will incur increased latency to their applications and desktops as a result of the higher CPU utilization.

Using Login PI

While it is not recommended, if you are planning on pushing the patches into your production environment to “see how it goes”, then install Login PI now to get an accurate representation of performance related to user experience. This will give you the ability to then compare to that same experience after the patches have been installed. We expect that you will see latency to the end user increase as a result of higher CPU utilization. If you already struggle with CPU utilization in your solution, there is a good chance you’ll be also using Login PI to test your availability.

As we complete our testing we will be sharing our findings in a series of articles.

If your computer has a vulnerable processor and runs an unpatched operating system, it is NOT SAFE TO WORK WITH SENSITIVE INFORMATION”. – Security Experts who discovered Meltdown / Spectre 

If sensitive data is part of your business (Such as ours!) patching is not a matter of if, but when.

Ask yourself:

How long can you afford to have your company’s data exposed to malicious intent?  Do you want to be the next Equifax or Target?

In this article series, we will provide some insight from our lab environments. Be aware your results may vary based upon individual workload and configuration.

Microsoft has released a Security Advisory

The vulnerability affects both the client and server OSs of Windows.  This is compounded when dealing with large-scale published application and desktops deployments.  The advisor can be found at the following location:

https://portal.msrc.microsoft.com/en-US/security-guidance/advisory/ADV180002

The specific details addressed in the security update and Windows KB are outlined in the Common Vulnerabilities and Exposures database.

Included are:

To completely protect yourself there are two phases of patching this vulnerability.

1 – Windows OS updates

2 – OEM device manufacturer firmware updates (not yet available)

Microsoft acknowledges addressing these vulnerabilities from a software perspective is limited, and therefore, without the OEMs providing updates the loop is not closed.

In the interim we can start measuring the impact of the Microsoft fixes.

They offer guidance for both Desktop and Server OSs:

Desktop –  January 2018 Security Update. Security Advisory: Click Here!

Server –  KB405690. Security Advisory: Click Here!

NOTE – Certain AV solutions are not compatible with the security update released by Microsoft. As such, unless an AV vendor has a registry flag, QualityCompat, they will not receive the January Security update and will still be vulnerable

With the upcoming OEM hardware patch releases we expect to be able to produce a variety of interesting and informative results.  Please stay tuned for the next articles!

Reference materials:

https://meltdownattack.com/

https://www.theregister.co.uk/2018/01/09/meltdown_spectre_slowdown/

https://www.geekwire.com/2018/microsoft-admits-meltdown-spectre-patches-will-hit-windows-server-performance/

Check the realtime performance/statistics of a NetScaler vServer


There are many a time you would want to look at the vServer performance due to many reasons – you are just too curious to see all the numbers, you have an issue with excessive resource utilization due to a particular vServer or you are seeing unusual amount of hits in Netscaler Insight Center graphs.

There are a couple of ways you could check it – via the NetScaler GUI and via the command line.

Via the GUI, its quite simple as all you need to do is to select the vServer and click on Statistics button to see the performance counters. the result will be as follows

vserver-counters1

Via the NetScaler GUI

Via the command line , it’s a bit more powerful as it gives the overall resource usage of the NetScaler appliance as well on top of all sorts of info at regular intervals. Here is what you need to do to check the realtime stats of a vServer

>shell
#nsconmsg -i vServername -s ConLb=2 -d oldconmsg

via-putty

via Putty / Command Line

 

XenApp & XenDesktop 7.x – Error “Incompatible Settings on SDK” on Delivery Groups


My colleague came across this error message while working with a customer where he had to prevent Citrix Desktops from being shown to users if they are in a particular AD group. He didn’t recall what he did wrong but he ended up with Desktops doubling up for a standard user who isn’t a member of exclusion group.

Inspecting the delivery group, he noticed Desktops per user settings  under User Settings has a different value “Incompatible Settings on SDK

desktopsperuser

Querying the Delivery group

Get-BrokerEntitlementPolicyRule

Going through the results, there is an additional desktop without any filtering applied. The fix is to remove the additional desktop. In his case, it was named was “Desktop_2”

Remove-BrokerEntitlementPolicyRule -Name "Desktop_2"

Running the get command shows the below results.. the second desktop is gone!!!

startbutton

Hopefully this helps someone.

XenApp & XenDesktop 7.x – Logon delay of 20 seconds at ” Please wait for Citrix User Profile Manager”


If you ever have a situation where your Citrix logons sits on “Please wait for Citrix User Profile Manager” for around 20 – 25 sec, you have come to the right post. I have had this strange one where users started to see their logons taking longer than usual. Below is my setup and your results may vary depending on the UPM version that you are running

  • XenApp 7.5 site
  • Server 2012 R2 VDAs
  • Citrix UPM 5.1
  • McAfee AV 8.8

Issue Manifestation – Delay of 20-25 seconds on ” Please wait for Citrix User Profile Manager” and this was consistent.

Remediation – Turn ON Citrix Profile Streaming

I am partially to blame for this issue as I had turned OFF Profile Streaming when I had my users complaining that the logoff was taking a little while. Now i need to figure out what is causing the logoffs to take roughly 10 seconds to complete. Already looked at the AV side of things and i have the required Exclusions. Hunt continues and will update with the findings..

So that’s a quick one there for you guys and hopefully someone will find it useful. If you find it working/not working for you, let me know