
Poor streaming performance on PVS server

Rick Culler




We have two physical PVS servers, each with 48 threads and 128GB of RAM, serving two vdisks to approximately 1200 VMs; these servers have been in place for a number of years. In the last 2-3 weeks we've run into a situation where performance of the streamed VMs is extremely poor (to the point of being almost unusable), from boot-up all the way through the XenDesktop session.

We have verified with a Citrix engineer that we are not experiencing packet loss on the network; rather, there seems to be latency between the PVS server and the vdisk store (the vdisks are stored locally on each PVS server on a RAID 5 array of SAS drives, a configuration we've used for quite some time). While on the line with the Citrix engineer, we thought we had found the culprit in missing AV exclusions in our image(s). Adding them did seem to resolve it initially, but a week or so later we are seeing similar symptoms even with the exclusions in place. Throughout our testing, it does appear to be latency on the PVS servers' storage system for the vdisks. For some context, some of the VMs have taken anywhere from 10 minutes to an hour or more to boot.


Environment Specs:
PVS 2203 CU2 streaming to Win10 VMs on XenServer 8.2 with latest patches

Cache mode: Cache to RAM with overflow to local HDD, RAM set at 512MB

CVAD 2203 CU2


PVS server - CPU is 5% usage, RAM is 8% usage, vdisk store is a bit more active


My main question is:
What kind of performance metrics are OK for the vdisk volume in a healthy environment?

Right now, about 7 hours after all VMs were rebooted and with only about 40 users connected, disk activity is usually a constant 5-7 MB/s, disk active time is a constant 85-95%, and disk queue length hovers just under 1, though we have seen it go up to around 1.5. Attached is a screenshot of the Resource Monitor view we see. The main problem is that I don't know or remember what a healthy streaming environment looks like. The primary vdisk image, which serves about 1150 VMs, is a merged base, so there are no deltas on top.
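As a rough sanity check, counters like these can be converted into an implied per-read latency via Little's law (latency ≈ queue length / IOPS). The 64 KB average read size below is an assumption, not a measured value; the real figure can be read from perfmon's "Avg. Disk Bytes/Read" counter:

```python
# Back-of-envelope latency from Resource Monitor counters via Little's law:
# average latency ≈ queue length / IOPS. The average I/O size is assumed.
throughput_mb_s = 6.0  # observed read throughput (from the thread)
avg_io_kb = 64         # assumed average read size; check Avg. Disk Bytes/Read
queue_length = 1.0     # observed average disk queue length

iops = throughput_mb_s * 1024 / avg_io_kb
latency_ms = queue_length / iops * 1000
print(f"~{iops:.0f} IOPS, implied average read latency ~{latency_ms:.1f} ms")
```

A ~10 ms implied latency would be plausible for spinning SAS disks under load, but the near-constant 85-95% active time is the more telling number: something is reading continuously.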



And my second question is:

From inside the VM image, does anyone have ideas on what specifically to look for in Task Manager/Resource Monitor, etc., to understand what might be throwing things off? I'm assuming I should look for anything causing a lot of disk reads, but are there any other tricks of the trade?


We have rolled back the image to a version where it was working well, but something still doesn't feel right in terms of performance, and when we put a small delta version on top of that rolled-back version, performance took a nosedive. The only changes in that delta were updating a couple of web browsers and installing a third-party plugin for MS Word.


Any tips and hints are greatly appreciated.



2 answers to this question


The configuration of the PVS servers would be expected to give good performance, specifically "128GB of RAM each, serving two vdisks".

What does Task Manager show under "Cached" in the Memory section of the Performance tab?

Windows caches reads from the local file system in memory that isn't being used for other purposes.

On PVS servers this is primarily vdisk reads. So with only 2 vdisks in use and 128GB of RAM, the majority of vdisk reads should be answered from the Windows cache (probably well over 100GB of RAM available for vdisk reads).

This is true unless the PVS servers were rebooted recently; a Windows reboot clears the cached reads from memory. If PVS server local read caching is needed for good performance, reboots must be handled with care and planned to avoid heavy loads (large volumes of target reboots) and usual production hours.
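As a back-of-envelope check on whether the vdisks can fit entirely in that cache (the OS working set and vdisk sizes below are assumptions for illustration; substitute the real values):

```python
# Rough headroom estimate for PVS vdisk caching in the Windows file cache.
# The OS working set and vdisk sizes are assumptions, not measured values.
total_ram_gb = 128         # physical RAM per PVS server (from the thread)
os_and_services_gb = 8     # assumed working set of Windows + PVS services
vdisk_sizes_gb = [60, 60]  # assumed size of each merged-base vdisk

cache_available_gb = total_ram_gb - os_and_services_gb
vdisk_total_gb = sum(vdisk_sizes_gb)

print(f"Cache headroom: {cache_available_gb} GB for {vdisk_total_gb} GB of vdisks")
if vdisk_total_gb <= cache_available_gb:
    print("Both vdisks can be served almost entirely from RAM once warmed.")
else:
    print("Vdisks exceed cache; expect sustained reads from the RAID array.")
```

With typical vdisk sizes and 128GB of RAM, the steady state after warm-up should be near-zero reads from the RAID array, which is why constant 85-95% disk activity is suspicious.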

Only ~6 MB/s of disk read activity with a Windows disk queue length of ~1 does look like a weak point, but most PVS reads should be answered from memory-cached file reads.

If you suspect target device reads are a source of slow performance, you can usually see that in network utilization on PVS servers.

On-prem 10 GbE networks often max out around 5-6 Gbps of UDP throughput.

Once that ceiling is approached, general streaming performance and boot times degrade significantly; latency is high whenever PVS streaming bandwidth is close to maximum utilization.
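A rough way to estimate when that ceiling bites (the per-target boot traffic and boot window below are assumptions, not measured values; actual boot reads vary widely by image):

```python
# Hedged estimate: how many simultaneously booting targets a PVS NIC can
# sustain. Per-target boot traffic and window are assumptions; measure yours.
nic_udp_capacity_gbps = 5.5  # practical UDP ceiling on 10 GbE (from the reply)
boot_read_mb = 300           # assumed MB streamed per target during boot
boot_window_s = 120          # assumed acceptable boot time in seconds

per_target_mbps = boot_read_mb * 8 / boot_window_s  # megabits/s per target
capacity_mbps = nic_udp_capacity_gbps * 1000
concurrent_boots = int(capacity_mbps / per_target_mbps)
print(f"~{per_target_mbps:.0f} Mbps per booting target; "
      f"roughly {concurrent_boots} concurrent boots before the NIC saturates")
```

If NIC utilization on the PVS servers stays well below this kind of ceiling while targets are still slow, the bottleneck is more likely on the server's storage or cache side than the network.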


CTX has an example PowerShell script to retrieve the processes with the highest reads on target devices.
It shows the same data visible in Task Manager by going to the Details tab and adding the "I/O read bytes" column.
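The CTX script itself isn't reproduced here; as a rough stdlib-Python analogue of that "I/O read bytes" ranking (shown against the Linux /proc interface purely for illustration — on a Windows target you'd use Task Manager's Details tab or the CTX script):

```python
# Illustrative analogue of Task Manager's "I/O read bytes" column:
# rank processes by cumulative bytes read. Parses Linux /proc/<pid>/io;
# on a Windows PVS target the CTX PowerShell script does the equivalent.
import os

def top_readers(n=10):
    """Return up to n (read_bytes, pid) pairs, largest readers first."""
    if not os.path.isdir("/proc"):
        return []  # non-Linux host; nothing to sample
    results = []
    for pid in filter(str.isdigit, os.listdir("/proc")):
        try:
            with open(f"/proc/{pid}/io") as f:
                fields = dict(line.split(": ") for line in f)
            results.append((int(fields["read_bytes"]), int(pid)))
        except (OSError, KeyError, ValueError):
            continue  # process exited, or its I/O stats are not readable
    results.sort(reverse=True)
    return results[:n]

for read_bytes, pid in top_readers():
    print(f"pid {pid}: {read_bytes / 1e6:.1f} MB read")
```

The same idea applies regardless of tooling: sample cumulative read bytes per process twice, a minute apart, and whichever process's counter is climbing steadily is the one generating the constant stream traffic.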




Found the culprit of the degraded performance, so I'll post here in case someone else runs into something similar.


It turns out something that got installed on one of our vdisk images was causing constant reads from the PVS server. What we saw was constant ~5 MB/s reads in disk activity and a disk queue length of ~1 on the volume that holds the vdisk stores. There were no spikes in the reads or the queue length; it was constant, even multiple hours after boot-up.


Once we reverted to an image about 10 versions prior, we saw a noticeable difference in disk activity once the vdisks were loaded into RAM: as the Windows login screen appeared, disk activity dropped like a stone at almost exactly the same time. Of course we still see some odd spikes in disk reads while things continue to be streamed/cached, but we haven't seen the same level of consistent disk reads since.

We'll go over our change log to try to identify the cause. Unfortunately it seems whatever it was happened a while back, as we had to go back quite a few versions.
