Citrix Monitor Service – Memory Leak

Following up on my previous post, I want to add another bug. This time it’s a memory leak in the Citrix Monitor service that frequently occurs on XenDesktop 7.1 DDCs: the service eventually takes all available memory. Director cannot be used while the memory leak is occurring.

The issue is known and there is a private hotfix. In case you need the hotfix, request it from Citrix Support.

Update 20.08.14: The issue still exists in XenDesktop / XenApp 7.5. There is another private hotfix for 7.5. Unfortunately, it hasn’t been released publicly yet.

Update 07.10.14: I just got a final hotfix for that issue. It still hasn’t been released to the public. In case you need it, create a Support case and refer to XD710MonitorSvcWX64002 for 64-bit and XD710MonitorSvcWX86002 for 32-bit.

XenDesktop 7.1 – Open issues

I’m currently working on an upgrade project from XenApp 6.5 to XenDesktop 7.1 App Edition. Yes, Citrix renamed their product from XenApp to XenDesktop App Edition, which means the existing XenApp has been merged with the existing XenDesktop product. Having both products in one console is basically a good idea, but many people were confused by the new name and did not understand that XenDesktop App Edition is the former XenApp. Therefore, Citrix renamed XenDesktop App Edition again (it’s a never-ending story…) back to XenApp. So with the upcoming 7.5 version, you will again find XenApp 7.5 and XenDesktop 7.5. That’s just an aside, though. What I wanted to write about are the issues I’m facing after a migration from XenApp 6.5 on Windows Server 2008 R2 to XenDesktop 7.1 App Edition running on Windows Server 2012.

Issue 1 – High CPU Usage when running Citrix Receiver (e.g. V4.1) on a Windows Server 2012:
Ever tried to install Citrix Receiver on a Windows Server 2012 and then launch some published desktops or apps? Yes? No issues? Yeah, that’s absolutely possible as long as the server isn’t running on XenServer. We realized that if you run a W2K12 server on XenServer and start some published apps/desktops from there, all the HDX Engine processes (wfica32.exe) of all users working on that server immediately increase their CPU usage as soon as sound is played. Technically, as soon as audiodg.exe is running on that server, all HDX Engine processes immediately and constantly increase their CPU usage by some percentage. One wfica32.exe process represents one connection to a published app or desktop, so the more sessions are running on the server, the higher the impact. In our environment, we see a CPU usage increase of 1/3. You won’t notice the issue on a server with just a few users, but what if you have designed your servers to serve 30, maybe 40 users? There you feel the pain: you can only serve 2/3 of these 30 to 40 users because the impact on the CPU usage is that high.
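To put numbers on that, here is a small back-of-the-envelope sketch in Python. It assumes the overhead eats a flat 1/3 of the server’s CPU capacity and that sessions scale linearly with CPU, which is a simplification; the function name is made up for this example.

```python
from fractions import Fraction


def usable_sessions(designed_sessions: int, cpu_overhead: Fraction) -> int:
    """Sessions that still fit when a fraction of the CPU is lost to overhead.

    Assumes a linear relation between CPU capacity and session count.
    """
    return int(designed_sessions * (1 - cpu_overhead))


if __name__ == "__main__":
    overhead = Fraction(1, 3)  # the increase we observed in our environment
    for designed in (30, 40):
        print(designed, "designed sessions ->", usable_sessions(designed, overhead))
```

With these assumptions, a server designed for 30 users is left with room for 20, and one designed for 40 with room for 26.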

So what is the solution? Unfortunately, there is no fix out yet that addresses the issue. We have an open case with Citrix. They confirmed that this is a bug, but they also told us that it’s a complex issue, which means they need several weeks to provide a private or public hotfix. I will update this post as soon as I get a fix for it.
The other thing I can provide is a workaround:
Stop or disable the Windows Audio service. This will prevent you from playing any sound on your server, but the issue is gone. If the performance of your servers and the user experience are more important than the ability to play multimedia, that’s a valid option.
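If you want to script the workaround, a minimal Python sketch could look like this. It assumes the default service name of Windows Audio (Audiosrv), uses the standard sc.exe tool and has to run from an elevated prompt; the function names are just for this example, and you should test it on a non-production server first.

```python
import subprocess
import sys

SERVICE_NAME = "Audiosrv"  # service name behind the "Windows Audio" display name


def audio_workaround_commands(service: str = SERVICE_NAME) -> list[list[str]]:
    """Build the sc.exe calls that stop the service and disable its start."""
    return [
        ["sc", "stop", service],
        # sc.exe expects "start=" and its value as separate arguments
        ["sc", "config", service, "start=", "disabled"],
    ]


def apply_workaround() -> None:
    """Run the commands; only meaningful on Windows with admin rights."""
    if sys.platform != "win32":
        raise RuntimeError("This workaround only applies to Windows servers.")
    for command in audio_workaround_commands():
        # "sc stop" returns a non-zero code if the service is already stopped,
        # so the return code is reported instead of raising an exception.
        result = subprocess.run(command, check=False)
        print(" ".join(command), "-> exit code", result.returncode)


if __name__ == "__main__":
    for command in audio_workaround_commands():
        print(" ".join(command))
```

Run as is, the script only prints the two commands; call apply_workaround() once you are sure you want to switch off audio on that server.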

09.04.2014 – Update: After having said that it’s a bug in Citrix Receiver, Citrix now believes it’s an issue with Windows Server 2012 running virtualized. But they are not totally sure about it, which is why I haven’t received an official statement. So we created an MS support case as well. Let’s see who is causing the issue… An update will follow.

03.06.2014 – Update: After the ball was passed back and forth between Citrix and Microsoft, Citrix finally acknowledged that it’s their turn to provide a solution. So it’s not a bug in Microsoft’s OS. Citrix provided us with a private fix for Citrix Receiver. It’s not a final fix that completely solves the root cause, but at least it allows us to finally enable audio on our virtualized servers. Let me know if you are interested in knowing the root cause in detail and I will do my best to publish it in my blog (it’s really a complicated issue that goes deep into the details of the OS). Besides providing us with a workaround in the form of a private hotfix, Citrix acknowledged that they have to rewrite the code in one of the critical parts to get a long-term solution. I’m glad that we now have a solution (although it’s not the final one) after “fighting” with Citrix and Microsoft for several months. 🙂

09.12.2014 – Update: Issue has been fixed in Receiver 4.2. You can set the polling interval with a registry key:

SlowHPCPolling

The longer the interval, the less CPU is used. We use 16 ms in our production environment and don’t see any negative side effects.

Issue 2 – Load Management based on Tags:
Having Load Management set up properly is very important because there is most probably a direct impact on the user experience if it’s either not set or set incorrectly. In my scenario, our VDA servers are split up into 3 different types. Type 1 servers have the best performance, type 2 the second best and type 3 the worst (from a hardware perspective). So the goal is to have more users on type 1 servers than on type 2 and 3, and more users on type 2 than on type 3, because there is more CPU and memory available.

If you navigate to the Load Management options in the Citrix Computer Policies, you have the following choice:

Citrix XenDesktop 7.1 Load Management


We decided to use “Maximum number of sessions” as the Load Management decision maker. All good so far. We created 3 separate policies, one for each server type. And then we want to have the policies applied to the servers of the corresponding server type:

– LOAD100 = Server Type 3
– LOAD140 = Server Type 2
– LOAD160 = Server Type 1
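To illustrate what we expect from these three policies, here is a simplified Python model. It is only a sketch under two assumptions that are not taken from the Citrix documentation: the number in each policy name is used as the session limit, and each new session goes to the server with the lowest ratio of current sessions to its limit. The names MAX_SESSIONS and distribute are made up for this example.

```python
# Hypothetical session limits per policy (assumption: taken from the policy names).
MAX_SESSIONS = {"LOAD100": 100, "LOAD140": 140, "LOAD160": 160}


def distribute(total_sessions: int, limits: dict[str, int]) -> dict[str, int]:
    """Place sessions one by one on the relatively least-loaded server."""
    sessions = {tag: 0 for tag in limits}
    for _ in range(total_sessions):
        # Relative load = current sessions divided by the configured maximum.
        target = min(sessions, key=lambda tag: sessions[tag] / limits[tag])
        sessions[target] += 1
    return sessions


if __name__ == "__main__":
    print(distribute(200, MAX_SESSIONS))
```

In this model the sessions end up roughly in the ratio 100:140:160, so the strongest server type carries the most users and the weakest the fewest, which is exactly the behavior we are after.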

Load Management in XenDesktop 7.1


If you have a closer look at the screenshot, you will notice that one last step is missing: filtering the policies to the corresponding server types. Citrix offers the following filter options that are applicable to XenDesktop 7.1:

– Delivery Group
– Delivery Group Type
– Tags

Because we have all servers in one Delivery Group, the only option that makes sense is Tags. Before we can configure the Tag filter, we have to set the tags. You can do that in Citrix Studio. Right-click on a server, choose Add Tag and done. We have set the tags LOAD100, LOAD140 and LOAD160, based on the server type. Alright, we are close to the end. Just add the corresponding Tag filter to each policy (it’s an Allow filter), choose the tag from the drop-down list and done.

So with that solution, you would be able to have 3 different load configurations based on the tags you set on the servers. The only thing is….yes, it doesn’t work. All servers will get the default configuration.

Solution? No solution yet. We have a case open with Citrix and are waiting for the issue to be fixed. I will update the post as soon as I know more from Citrix. I would have loved to set the filter on Machine Catalogs, which would be a nice workaround when filtering on Tags doesn’t work, but that’s not possible.

09.04.2014 – Update: Citrix finally communicated that this isn’t a bug but expected behavior. This means that Citrix policies filtered by tags don’t work for Shared Desktops. They will update the Citrix documentation because it’s absolutely not clear that this isn’t supported. We found an alternative in the meantime. If you work with AD GPOs, just create additional GPOs with different Load settings and apply them to your servers using security filtering. We created AD groups and added the servers to them. This way the servers get the correct load settings. Just make sure that the order of the applied GPOs is correct so that the right policy wins. Problem solved….

Issue 3 – the random BSOD caused by vdtw30.dll
We are struggling with random BSODs caused by vdtw30.dll. They really are random and only happen occasionally, which makes them hard to troubleshoot. XenDesktop 7.1 already contains some fixes that should prevent strange behavior caused by vdtw30.dll, but there seems to be another problem around. Again, there is an open case with Citrix, which brings us to the next and last issue for today….


03.06.2014 – Update: We have already received 2 private fixes which reduced the frequency of the BSODs, but they are still not gone. I will provide an update as soon as the issue is completely solved.

Issue 4 – Unable to create a dump file from a provisioned Windows Server 2012 on XenServer 6.1 or 6.2:
Citrix Support requests a complete memory dump to troubleshoot issue 3 from above, which I completely understand. But what if you configure the provisioned server to create a complete memory dump file and it just won’t do it? Welcome to issue 4! 🙂

There is a Citrix Support article that tells you how to get a complete memory dump from a provisioned machine: http://support.citrix.com/article/CTX123642. This might work on Windows Server 2008 R2, but it won’t on Windows Server 2012 (running on XenServer 6.1). I’m still in contact with Citrix Support and hope to get a solution soon. In the meantime I managed to get at least a minidump file. Based on the article, you need a local storage disk on which the dump file is saved. Just add another local storage disk to your (virtualized) server and restart the server. You don’t have to create a partition on that disk, just add it as an additional disk. If you do that, the dump files get created, including the complete memory dump file. The only problem is that the complete memory dump file will be corrupt, while the minidump is not…

Update: There is a registry value you can change so that creating a complete dump file works. This is the answer from Citrix Support:

We can actually emulate the disk we are attaching for the dump collection.

To emulate the disk that contains the crashdump path, edit HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Xenfilt\Parameters, remove the appropriate entry (probably “0”) from the REG_MULTI_SZ “VBD” and reboot the VM. The appropriate entry will be the Position as shown by XenCenter, 0 for ATA Channel0 master, 1 for ATA Channel0 slave, 2 for ATA Channel1 master (ignored, as DVD-ROM is always emulated), 3 for ATA Channel1 slave.
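If you have to do this on several machines, the registry change can be scripted. The following Python sketch uses the standard winreg module; it assumes the “VBD” entries are plain position numbers stored as strings (the answer above only says “probably 0”), and the function names are made up for this example. Run it elevated, export the key as a backup first and reboot the VM afterwards.

```python
KEY_PATH = r"System\CurrentControlSet\Services\Xenfilt\Parameters"
VALUE_NAME = "VBD"


def without_position(entries: list[str], position: int) -> list[str]:
    """Return the VBD entries with the given disk position removed.

    Assumption: each entry is just the position number as a string.
    """
    return [entry for entry in entries if entry.strip() != str(position)]


def emulate_disk(position: int) -> list[str]:
    """Remove one position from the REG_MULTI_SZ value "VBD" (Windows only)."""
    import winreg  # imported here because the module only exists on Windows

    access = winreg.KEY_READ | winreg.KEY_SET_VALUE
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, KEY_PATH, 0, access) as key:
        entries, value_type = winreg.QueryValueEx(key, VALUE_NAME)
        if value_type != winreg.REG_MULTI_SZ:
            raise ValueError("VBD is not a REG_MULTI_SZ value")
        remaining = without_position(entries, position)
        winreg.SetValueEx(key, VALUE_NAME, 0, winreg.REG_MULTI_SZ, remaining)
    return remaining


if __name__ == "__main__":
    # Dry run on sample data: drop position 0 (ATA Channel0 master).
    print(without_position(["0", "1"], 0))
```

The dry run at the bottom only works on sample data; emulate_disk(0) is the call that actually changes the registry.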

That works perfectly and I’m really happy that we could finally manage it as it’s important to have the option to get a complete memory dump for troubleshooting reasons.

 

That was it for today…. As you can see, there are still many open issues with XenDesktop 7.1 which are absolutely critical in a production environment. I have some other critical issues in mind, but that’s enough for today. If I find a few minutes, I will extend this post and let you know about other issues….

And by the way….XenDesktop/XenApp 7.5 will be launched this month, maybe we are lucky and almost all these issues are gone with the new version 😉

Regards,
Michael