XenServer 8 lack of performance


Question

Posted

With the implementation of XenServer for our VDI workload we are struggling with the performance of the VDI.

From our own investigation we think the problem has something to do with NUMA assignment. Why: we often see that the vCPUs of a VM are not assigned to a single socket, but spread across cores on both sockets. This defeats the preferred single-socket NUMA placement.

Posted

(wasn't finished typing)


 

Hypervisor hardware

Dell PowerEdge R7525

CPU: 2x AMD EPYC 75F3 32-Core Processor

RAM: 1024 GiB

Storage: 900 GiB SSD

SMT (HT) active, resulting in 128 logical cores

 

BIOS is set to maximum performance mode.

Additional settings tested in the BIOS, none of them giving better performance:

  • NUMA nodes per socket (default 1)
  • L3 cache as NUMA domain disabled/enabled; tested with enabled, which creates a lot of small NUMA nodes
  • Memory interleaving on/off

 

Citrix infrastructure

Citrix functional level is 2203 LTSR.

 

The XenServer host is updated to the latest XenServer 8 release (as of 7/11/2024).

 

Within XenServer the following configurations have been made:

  • Read cache on the storage, and Domain0 memory enlarged to 32 GiB for more read caching and thus offloading
  • Checked and configured the setting numa-placement=true in the xenopsd.conf file
  • Configured xe host-param-set uuid=<uuid> numa-affinity-policy=best_effort

This changed things a little bit: we can see it tries to keep NUMA locality, although many times the vCPUs are still spread across the two CPU sockets.
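To make the spreading visible, it may help to dump the host's NUMA topology and the actual vCPU-to-pCPU placement from dom0. A minimal check, assuming the standard `xl` tool in the XenServer 8 control domain (the VM name is a placeholder):

```shell
# Show the host NUMA topology: nodes, their memory, and which pCPUs
# belong to which node
xl info -n

# List every vCPU of every running domain, with the pCPU it currently
# runs on and its affinity; vCPUs of one VM landing on pCPUs of both
# nodes confirms the cross-socket placement
xl vcpu-list

# Limit the output to one VM (name as shown by `xl list`)
xl vcpu-list <vm-name>
```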

VDI VM configuration

  • Windows 10 22H2, all Windows updates
  • 4 vCPUs. Tried 4 sockets x 1 core / 2 sockets x 2 cores / 1 socket x 4 cores. 4 sockets with 1 core seems to perform best.
  • 14 GiB memory
  • Machines are set up with MCSIO with 4 GiB write cache in RAM and a 10 GiB cache disk on the local SSD storage.
  • Optimization is done with the Citrix Optimizer
  • XenTools are installed and up to date, with I/O optimization enabled

 

The VM limit is set to 64 VMs on this host: 64 x 4 = 256 vCPUs, a 2:1 ratio against the 128 logical cores.

If we reduce the number to 4 VMs on this host, the performance is still not good.

We tested modifying the credit scheduler with "xl sched-credit -s -t 10 -r 1000"; this helped performance a bit.
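For reference, this is what that command changes, and how to read the current values back (assuming the host runs the credit scheduler, which is where `xl sched-credit` applies):

```shell
# Show per-domain weight/cap and, with -s, the system-wide parameters
xl sched-credit
xl sched-credit -s

# The tested change: -t sets the scheduler timeslice in milliseconds,
# -r sets the context-switch ratelimit in microseconds
xl sched-credit -s -t 10 -r 1000
```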

Are there some logging options or other things we can do to investigate the lack of performance?
