
Poor iSCSI performance only through XenServer


Tim LoGiudice

Question

We have a XenServer pool (7.6 Enterprise, all current patches applied) with two servers and an iSCSI SAN box, all connected with 10Gb Ethernet.  Each server has two 10Gb NICs: one dedicated to iSCSI SAN traffic and one for everything else.  In the iSCSI box there are two LUNs on identical 12Gb SAS arrays using identical disks.  One of the LUNs is connected to the XenServer pool as shared storage; the other is connected directly to a VM running in the pool.  Multipathing is turned on in all cases (on the XenServer hosts and inside the VM).

 

The performance of the LUN connected to xen is awful, and the performance of the LUN connected directly to the VM is great.

Any given VM running any given OS appears to get 30-40MBps to a VHD on the iSCSI SR.  It doesn't matter whether it's the only VM running on the pool at the time; it will not go over 40MBps.

 

The VM connected directly via iSCSI gets 400-600MBps to the LUN on the iSCSI box, but the very same VM only sees 30-40MBps to a mount on the iSCSI SR. 

 

So the speeds differ dramatically, even though both paths go from the same physical server, over the same physical NIC and switch, to the same physical iSCSI controller, connected to the same size and type of array made up of the same model of disks.
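For anyone who wants to reproduce this kind of comparison, dd with direct I/O is one simple way to test sequential throughput on both paths; the mount points below are placeholders for a filesystem on the SR-backed virtual disk and one on the directly attached LUN:

    # Inside the VM: sequential write to a virtual disk on the iSCSI SR,
    # bypassing the guest page cache so the number reflects the storage path
    dd if=/dev/zero of=/mnt/sr-vdi/test.bin bs=1M count=1024 oflag=direct

    # Same test against the directly attached iSCSI LUN
    dd if=/dev/zero of=/mnt/direct-lun/test.bin bs=1M count=1024 oflag=direct

    # Read back with direct I/O after dropping caches so reads hit the storage
    echo 3 > /proc/sys/vm/drop_caches
    dd if=/mnt/sr-vdi/test.bin of=/dev/null bs=1M iflag=direct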

 

There doesn't seem to be much going on in dom0 during my testing.  It's not maxing out any CPU cores (it has 8) or RAM (2.9GB total, around 1.3GB used during my testing).

 

I'm coming up empty on reasons for this performance discrepancy.  It's far too dramatic to be any sort of overhead from xen, so there's got to be some configuration issue that I'm just not seeing.

I have another, much older pool that has been in service for about 9 years, currently running XenServer 7.0 Free Edition on much older hardware with only gigabit links (bonded, but shared; no dedicated SAN) to a much older iSCSI box running slow SATA drives, and it gets much better performance from its iSCSI SR than I'm seeing with this new one.

 

If anyone has any ideas about potential causes for this, I would greatly appreciate hearing them.  I've searched extensively and found nothing that's made any difference.

Thank you,

Tim


8 answers to this question


That is dramatic. Do you use a custom multipath.conf file from your storage vendor? Do you use jumbo frames? 

Thinking back on my own testing, I've never really noticed a difference between an iSCSI SR and a LUN mapped directly to a VM; if anything, the directly mapped LUN inside the VM should be slower.  On the networking side, a VM with an iSCSI LUN mapped directly will typically route that traffic out through your VM interfaces, while the XenServer SR will use your storage interfaces, although that depends on how many management interfaces you have and how they are configured.
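
One quick sanity check, roughly sketched (the output will obviously depend on your array): see whether multipathd is matching a device-specific entry or just falling back to its built-in defaults, and how the paths to each LUN are being grouped:

    # In dom0: list the multipath maps, their paths and the policy in use
    multipath -ll

    # Dump the configuration multipathd has actually merged together
    # (built-in defaults plus anything from /etc/multipath.conf)
    multipathd -k"show config"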

 

--Alan--

 


Thanks for the responses.


I'm not using a custom multipath.conf.  I was not using jumbo frames originally, but I turned those on last Friday to see if it would make a difference (it did not).  
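
In case it's useful to anyone else, jumbo frames only help if every hop honours the larger MTU end to end; a quick way to confirm that is a don't-fragment ping sized for a 9000-byte packet (the target IP below is a placeholder for the iSCSI portal):

    # From dom0 and from inside the VM: 8972 bytes of payload + 28 bytes of
    # IP/ICMP headers = a 9000-byte packet; this fails if any hop is still at 1500
    ping -M do -s 8972 192.168.50.10

    # Also confirm the MTU set on the XenServer storage network object
    xe network-list params=uuid,name-label,MTU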

The VM with the directly mounted iSCSI LUN has a NIC mapped to the storage NIC on the xen host, so it is using the storage interface as well (there is no path to the iSCSI box via the other NIC so it's definitely using the right one).

 

From dom0 I'm seeing 300-500MBps, so it is just the guests experiencing the slowdown.  In fact, if I test the performance of the iSCSI SR from a VM and from dom0 simultaneously, I get 300-400MBps on dom0 and 30-40MBps on the VM.  If I test the iSCSI SR from dom0 and the direct iSCSI LUN from the VM simultaneously, I get 300-400MBps from each.

 

dom0 does not seem to be overloaded.  Each dom0 has 8 cores available, and I see 50-75% usage of one core and <10% usage overall during testing.  Out of 2.9GB of RAM, I see 1.3-1.4GB free during testing.
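
For completeness, xentop in batch mode is one easy way to watch per-domain CPU usage and virtual block device activity while a test runs (the 2-second interval is arbitrary):

    # In dom0: snapshot per-domain CPU usage and VBD read/write counts every 2 seconds
    xentop -b -d 2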

 


You really do need to match which multipath.conf entry you use to the specific model of storage, as that will make a difference; a generic configuration will not work as well, and in some cases, not at all. What is your specific storage device and is there an entry for it in the multipath.conf file? If not, can your vendor supply one?
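
Purely as an illustration of what such an entry looks like, a device-specific section in /etc/multipath.conf has this shape; every string and value below is a placeholder, and the real vendor/product strings and tuning values need to come from the storage vendor:

    devices {
        device {
            # These must match the SCSI inquiry strings the array reports
            vendor                  "VendorName"
            product                 "ModelName"
            # Example values only; use whatever the vendor recommends
            path_grouping_policy    multibus
            path_selector           "round-robin 0"
            path_checker            tur
            no_path_retry           12
            rr_min_io_rq            1
        }
    }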

You could also test the I/O with multipath disabled to see what difference in performance you observe.
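
If you do try that, the rough sequence on XenServer 7.x (as I understand it; the host UUID is a placeholder) is to put the host into maintenance mode, flip the multipathing flag, then re-plug the SR and repeat the I/O test:

    # With the host in maintenance mode (or use the Multipathing tab in XenCenter):
    xe host-param-set uuid=<host-uuid> other-config:multipathing=false
    # Exit maintenance mode / re-plug the SR's PBDs, then re-run the same test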

 

-=Tobias


The storage vendor is Raidmachine and the model is 7316RE.  It's a simple box with an Areca controller.  It doesn't have an entry in the multipath.conf file, but given what it is, I would expect it to behave well with the defaults.  The VM it's connected to directly has multipathing configured the same way as the XenServer hosts, and the performance there doesn't appear to be impacted.  I'll contact the vendor to see if they have any thoughts.  I'll also try disabling multipathing on the XenServer hosts to see if it makes any difference.
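
For reference, the SCSI vendor/product strings the box reports (which are what a multipath.conf entry has to match) can be read from dom0 like this; sdX is a placeholder for one of the LUN's paths:

    cat /sys/block/sdX/device/vendor
    cat /sys/block/sdX/device/model
    # or, if it's installed, list everything at once
    lsscsi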

 

Thanks


Actually, in many cases the defaults do not work at all.  Check the HCL to see if this unit is officially supported, and regardless, I'd contact the vendor to see if they can provide a custom entry.  The very short list of entries provided by Citrix covers very few devices, and I would venture to say that if the configuration is not tailored specifically to a device, performance will be unpredictable.  My direct experience has only been with devices that have vendor-supplied multipath entries, and while results were not as good as direct connections from a VM, they were nowhere near this degraded.

 

-=Tobias

