  • 0

Critical performance degradation on provisioning + SMB3 client


Djon Smit

Question

Test bench
OS: Windows Server 2019 with Hyper-V. PVS 2112 is installed on a Hyper-V virtual machine.
Simple Layer 2 switches with 2 SFP+ 10G ports and 24 1G ports.
For the test, a program that reads linearly from the C drive runs on the client machine.
A network drive is connected over SMB3 from the virtual machine that hosts PVS.
On that network drive, a client machine booted from PVS runs a second test task: WinRAR archives a 10 GB file without compression to reach the maximum read/write speed.

Reading from the C drive drops to almost zero, with short complete stalls.
This critical drop in C drive read speed hits all clients connected to one switch, even if only 1 client on that switch runs the scenario described above. The other clients only read from the C drive. It happens even if there are only 2 active clients on the switch and the switch is under 10 percent load.
The switch itself has tenfold performance headroom.

It looks as if the traffic generated by PVS gets low priority.
It is unclear why performance degrades for all PVS clients on one switch, on all ports, and only for PVS resources, as if it were a hub and not a switch.
At the same time, on a second switch connected through the first one, clients read from the C drive without drops while the drops occur on the first switch. When a similar scenario is run on the second switch, the situation repeats for its clients.
I also tested a similar program from another manufacturer under the same conditions: the same scenario, the same hardware, the same virtual machine, and the same settings.
With it, network performance is distributed normally, and there are no drops on the C drive on the client.
On all loaded clients, reading from the C drive within one switch is almost 100 times faster than with PVS.
It can fully load the shared 10G link.

This testing was repeated on different physical hardware, and the results were always the same.
Perhaps inexpensive switches are to blame, but then why do the same switches work fine with the other, similar program?
What happens to PVS traffic when all other traffic works fine?

description.png


4 answers to this question


  • 0

I am not sure exactly what you are testing here  -- some questions:

  • What are PC1/2/3 running? Are they PVS target machines? So reading the C drive actually results in PVS streaming traffic?
  • Am I right that in addition to running PVS targets from the PVS server, you are also using the PVS server as a file server AND you have PC2 writing a large file to a share on that machine at the same time that you are running the read test? Where is the source of this 8GB file being written? Is it also on the boot disk?
  • What is the network utilization like on the PVS server, and on PC1/2/3?

BTW - your test is not really representative of normal usage patterns in Windows - it's more like the worst possible boot storm case where every target is loading all of Windows into memory for the first time.

 

Simon

 

  • 0

Drive C is the PVS target disk on all tested computers.
The file server was tested both on a separate virtual machine and on the same virtual machine as the PVS server. The results are similar.
The drives used are Micron 9200 MAX NVMe SSDs.
Moreover, separate disks were used for PVS and for file storage.
Network usage on PC 1, 2, 3 is as shown in the diagram.
At the time of testing, there was no other activity on the switches.
The performance of the equipment is sufficient.

Blaming a slow disk would be too easy: a slow disk would also show up in the test with the other program.
This scenario is far from the worst case; it is an ordinary scenario of working with a file server. When one client works with file storage, all clients on the same switch see the C drive slow down. If several clients work with file storage, can you imagine what will happen?
The other program does not have these drawbacks.
If clients only read from the PVS-streamed C drive, there are no problems even with a large number of clients, and performance can be very high. But as soon as competing traffic interferes, especially SMB3 in read-write mode, big problems begin.
This can go unnoticed until performance drops sharply. I wonder if this is just my special case? Why hasn't anyone noticed it before, if it has been going on for many years in past versions?
It would be understandable if the slowdown started under significant load, but here there are only 2 clients.
This needs to be fixed.

  • 0

I have difficulty fully understanding the description.

Perhaps the test scenario introduces higher latency than may be observed in normal circumstances, or there is unexpected network saturation in the scenario (giving rise to some packet loss or high latency).

 

 

PVS traffic is UDP based and expects a reliable network with no packet loss and low latency: single digit milliseconds, which enterprise networks would be expected to deliver.

Network latency in double digit milliseconds can substantially reduce PVS streamed C drive I/O characteristics.

Intermittent UDP packet loss and higher latencies can also be accommodated by PVS via the PVS retry mechanism, but when triggered would be observed as reduced I/O or intermittent drops in I/O against the streamed C drive.
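
To make the effect of latency and retries concrete, here is a rough back-of-the-envelope sketch. It is not how PVS is implemented: it assumes a generic request/response model in which each request waits one round trip, and each lost request costs one retry timeout before it is resent. The function name, the 32 KiB request size, and the timeout values are all hypothetical and chosen only for illustration.

```python
def estimate_stream_throughput(rtt_s, request_bytes, in_flight=1,
                               loss_rate=0.0, retry_timeout_s=0.0):
    """Estimate bytes/second for a simple request/response stream.

    Simplified model (an assumption, not the PVS algorithm):
    - each request completes after one round trip (rtt_s),
    - a lost request is resent after retry_timeout_s,
    - losses are independent, so the expected number of failed
      attempts per request is loss_rate / (1 - loss_rate).
    """
    if not 0.0 <= loss_rate < 1.0:
        raise ValueError("loss_rate must be in [0, 1)")
    expected_retries = loss_rate / (1.0 - loss_rate)
    time_per_request = rtt_s + expected_retries * retry_timeout_s
    return in_flight * request_bytes / time_per_request


if __name__ == "__main__":
    size = 32 * 1024  # hypothetical request size
    for label, kwargs in [
        ("1 ms RTT, no loss", dict(rtt_s=0.001)),
        ("20 ms RTT, no loss", dict(rtt_s=0.020)),
        ("1 ms RTT, 1% loss, 100 ms retry timeout",
         dict(rtt_s=0.001, loss_rate=0.01, retry_timeout_s=0.100)),
    ]:
        rate = estimate_stream_throughput(request_bytes=size, **kwargs)
        print(f"{label}: {rate / 1e6:.1f} MB/s")
```

In this model throughput scales inversely with round-trip time, so going from 1 ms to 20 ms cuts the estimate by a factor of 20, and even a small loss rate hurts badly when the retry timeout is much longer than the round trip.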

 

SMB3 traffic is largely TCP based.

 

 

Question:

During the problem behaviour, do you notice retries increasing on the PVS targets in the PVS status tray tool?

 

Test: 

Is it possible to test if the switches achieve actual 10Gb throughput on the 10Gb ports? 

There will be differences in maximum TCP versus maximum UDP.

Usually on 10Gb switches you would expect TCP to achieve 10Gb throughput with minimal loss, but UDP may max out around 5Gb.

Maximum throughput should be determined for both TCP and UDP.
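
As a starting point for the UDP side of such a test, here is a minimal sketch in Python. It sends numbered datagrams and counts how many arrive. As written it runs sender and receiver on one machine over loopback, which only checks that the script works. For a real measurement the receiver would have to run on a second machine behind the switch under test, and a dedicated tool will reach far higher rates than a Python loop. The function names, packet count, and payload size are arbitrary choices for illustration.

```python
import socket
import struct
import threading
import time


def receive_datagrams(sock, result, idle_timeout=0.5):
    """Count unique sequence numbers until the socket is idle for idle_timeout s."""
    sock.settimeout(idle_timeout)
    seen = set()
    total_bytes = 0
    while True:
        try:
            data = sock.recv(65535)
        except socket.timeout:
            break  # no traffic for idle_timeout seconds: assume the sender is done
        seen.add(struct.unpack("!I", data[:4])[0])
        total_bytes += len(data)
    result["received"] = len(seen)
    result["bytes"] = total_bytes


def udp_loss_test(host="127.0.0.1", count=2000, payload_size=1400):
    """Send `count` numbered datagrams to a local receiver and report loss."""
    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    rx.bind((host, 0))  # port 0: let the OS pick a free port
    port = rx.getsockname()[1]
    result = {}
    receiver = threading.Thread(target=receive_datagrams, args=(rx, result))
    receiver.start()

    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    padding = b"\0" * (payload_size - 4)  # 4 bytes are used by the sequence number
    start = time.perf_counter()
    for seq in range(count):
        tx.sendto(struct.pack("!I", seq) + padding, (host, port))
    elapsed = time.perf_counter() - start

    receiver.join()
    tx.close()
    rx.close()
    return {
        "sent": count,
        "received": result["received"],
        "loss": 1.0 - result["received"] / count,
        "send_mbit_s": count * payload_size * 8 / elapsed / 1e6,
    }


if __name__ == "__main__":
    print(udp_loss_test())
```

Running the same sequence-number check while the SMB3 copy is active, and again while it is idle, would show whether datagrams are actually being dropped under competing traffic.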

 

Test:

Can you perform similar testing with the mentioned switches removed?

Perhaps run the PVS targets on the same host as the PVS server and the SMB share, using the hypervisor's internal 10Gb+ virtual networking?

The SMB test should also be able to use the increased bandwidth.

 

Design Choice:

Some customers choose to entirely separate PVS traffic from other production traffic, ensuring that on PVS network segments, there is only PVS traffic.

In the described use case, where high PVS network utilisation is expected alongside unlimited other network traffic, this separation may be beneficial.

However, this design is not required in most customer use cases, and can add unnecessary complexity.

  • 0

You are right, the similar software works over the IP protocol. I am posting its result under the same conditions: the picture for PC 3 did not change, and it works without problems. Out of curiosity, I obtained a slightly outdated enterprise-class Juniper EX 4200 for testing, but I will only be able to test it much later. Do you see these performance graphs? Are you saying that the PVS server will only perform better on expensive hardware? What is the advantage of that?
Even in ideal conditions, without competing traffic, the access speed of the PVS server over UDP has no advantage over the analogue. And with competing traffic, the PVS server does not stand up to criticism on conventional equipment. I will check it on enterprise equipment, but even if the PVS server works without drops there, this limits where it can be used. Maybe it makes sense to add support for the IP protocol, which would expand the capabilities of the PVS server and keep the program from becoming an outsider?

Instead of separating and isolating traffic, it is better to improve the PVS server!

description.png
