
VM performance impacted during live VM migration


Question

Hello Folks,

 

I am using XenServer 7.0. We have 8 servers in a pool, with NFS shared storage for the pool. A number of VMs run on it, including a few database VMs with 64 GB of memory each, which usually carry a heavier workload because of the DB (OS type: CentOS).

 

What we have observed so far is that when we try to live-migrate a VM from one host to another, these DB VMs always hang and the migration does not complete. We checked the VM's performance: when the migration starts, CPU usage spikes to almost 100% and the VM hangs. My understanding is that when we initiate a live migration, it tries to preserve the VM's state, meaning it temporarily holds all writes happening on the VM in order to move the resources from one host to another. So is it possible that this causes I/O wait on the VM, which in turn causes the CPU spike?

 

Could this be what makes the VM hang during live migration? Can any expert provide more clarification on this?

 

Regards

Vivek Kumar

 

 

Link to comment

5 answers to this question


Sounds like you need to allocate more memory to dom0. How many vCPUs are assigned to dom0 - enough? With top running during migration, neither the CPU nor memory should hit 100% utilization. Migration is a big resource hog and adequate resources need to be allocated to dom0 to be able to handle migrations more efficiently.
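
For anyone who wants to log this during a migration rather than watch top by hand, here is a minimal sketch. It assumes dom0 exposes the usual Linux `/proc/meminfo` fields; the function names are made up for illustration, and treating free + buffers + page cache as "available" is a simplification. CPU load would still need to be watched separately (for example with top, as suggested above).

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer kB values."""
    values = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        key, rest = line.split(":", 1)
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def memory_used_percent(meminfo):
    """Percent of RAM in use, counting free + buffers + page cache as available."""
    total = meminfo["MemTotal"]
    available = (meminfo.get("MemFree", 0)
                 + meminfo.get("Buffers", 0)
                 + meminfo.get("Cached", 0))
    return 100.0 * (total - available) / total


def swap_used_kb(meminfo):
    """kB of swap in use; non-zero swap during a migration suggests memory pressure."""
    return meminfo.get("SwapTotal", 0) - meminfo.get("SwapFree", 0)


def read_dom0_memory(path="/proc/meminfo"):
    """Hypothetical helper: read the live values when run inside dom0."""
    with open(path) as f:
        return parse_meminfo(f.read())
```

Calling `memory_used_percent(read_dom0_memory())` in a loop on the source and destination hosts while the migration runs would show whether dom0 memory is getting close to exhausted.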

 

-=Tobias

Link to comment

Hello Tobias,

 

Thank you so much for your reference.

 

We have enough resources on the hosts: we have assigned 8 GB of RAM to Dom0, and the top command shows 16 vCPUs. When we perform a live migration of a VM, what happens on the back end? Does it try to freeze the state of the VM? The Xen documentation clearly mentions that a live migration impacts the performance of the VM.

 

If a VM is taking a high number of writes, what happens to those writes while the VM is being migrated? We see a clear CPU spike in the VM's performance when we start the migration. Say a VM is using 10% of total CPU (in the VM's performance graph); when I initiate the migration, the VM's CPU goes to 100% and the VM hangs. Here you can see the VM's CPU usage during the migration:

 

[Screenshot: VM CPU usage during the migration]

 

Inside the VM we can see the messages below:

 

[Screenshot: messages shown inside the VM]

 

Here is the hosts' CPU status. It was also quite normal that whole day (just before the 28th):

 

[Screenshots: host CPU status]

Link to comment

Vivek,

That's a big load, and that can happen if it is a big VM with a lot of virtual memory. Even 8 GB may not be enough for dom0 - do you see RAM running out on dom0 when you run top? Also, for migrations to work efficiently - if at all - XenTools has to be properly installed.

This is just trying to move the VM to a different host and not also a storage migration, correct?

 

HTH,

-=Tobias

Link to comment

Hello Tobias,

 

Yes! This VM has 64 GB of memory, and we are only migrating it from one host to another, that's it. We also observed one very interesting fact: the VM failed to migrate while the DB services were running inside it. After the first migration failed and the VM went into a hung state, we stopped the DB services inside the VM (we suspected the VM was probably doing heavy reads/writes) and tried to migrate again, and it moved successfully.

 

We have tried this at least 3 times. With an active workload the VM went into a hung state during migration (tried twice), but it migrated successfully with no DB services running (meaning no load on the VM).

 

XenTools are properly installed and load was quite normal on the server. The source host has 512 GB of memory in total, of which hardly 200 GB was in use at the time of migration.
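
One way to reason about the "works when idle, hangs under load" pattern is a toy model of pre-copy live migration in general: memory is copied while the VM keeps running, and pages dirtied during each pass are re-sent in the next pass until the remainder is small enough to send during a brief pause. The sketch below is only an illustration. The dirty rates, bandwidth, threshold, and round limit are hypothetical numbers, not measurements from this pool and not XenServer's actual parameters.

```python
def precopy_rounds(memory_gib, dirty_rate_gib_s, bandwidth_gib_s,
                   stop_threshold_gib=0.05, max_rounds=30):
    """Simulate iterative pre-copy.

    Round 1 sends all memory; each later round re-sends what the guest
    dirtied while the previous round was in flight.
    Returns (converged, rounds_used, remaining_gib), where remaining_gib
    is what would still have to be sent while the VM is paused.
    """
    to_send = float(memory_gib)
    for round_no in range(1, max_rounds + 1):
        seconds = to_send / bandwidth_gib_s
        # Memory dirtied during this round must be sent again (capped at VM size).
        to_send = min(float(memory_gib), dirty_rate_gib_s * seconds)
        if to_send <= stop_threshold_gib:
            return True, round_no, to_send
    return False, max_rounds, to_send


# Hypothetical 64 GiB VM over a link that moves about 1 GiB/s.
idle = precopy_rounds(64, dirty_rate_gib_s=0.01, bandwidth_gib_s=1.0)
busy = precopy_rounds(64, dirty_rate_gib_s=1.2, bandwidth_gib_s=1.0)
print(idle)
print(busy)
```

In this model the lightly loaded VM shrinks its remainder quickly, while a guest that dirties memory faster than the link can carry it never gets below the threshold, so whatever is left has to be copied while the VM is paused. Whether that is what happens with the DB VMs here would need to be confirmed from the host logs.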

 

 

 

Link to comment

You could try allocating more memory to dom0 or possibly using a different network over which to do the migration, but that's a huge VM. Shutting it down is probably not an option. Not really sure what else to suggest at this point other than to be sure all hotfixes are up-to-date and applied to all pool hosts.
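
To put rough numbers on the network suggestion, here is a back-of-the-envelope sketch of how long a single full pass over 64 GiB of memory takes at different link speeds. The link speeds and the 0.8 efficiency factor are assumptions for illustration; the thread does not say which network the pool uses for migration.

```python
def full_copy_seconds(memory_gib, link_gbit_s, efficiency=0.8):
    """Seconds for one full pass over VM memory at a given link speed.

    efficiency is an assumed fraction of line rate actually achieved.
    """
    bytes_total = memory_gib * 1024 ** 3
    bytes_per_second = link_gbit_s * 1e9 / 8 * efficiency
    return bytes_total / bytes_per_second


for gbit in (1, 10):
    print("%2d Gbit/s: about %.0f s for one pass over 64 GiB"
          % (gbit, full_copy_seconds(64, gbit)))
```

Under these assumptions one pass takes roughly 687 seconds at 1 Gbit/s and roughly 69 seconds at 10 Gbit/s, and a busy database has that whole time to keep dirtying memory that then has to be sent again.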

 

-=Tobias

Link to comment
