
Long-time nagging problem with Linux VMs crashing during live migration. Possible work-around!


jamie bennion1709153033

Question

Since around XenServer 7.6ish I've had a problem with some Ubuntu Linux VMs crashing AFTER live migration.

The problem still exists for me on Hypervisor 8.0.

The crashes are very peculiar. After the live migration completes, the VM will be ok for up to a few minutes. No symptoms.

But soon, processes hang and network traffic hangs. The console/Bash seems OK at first, but commands don't work. Eventually console output like that in memstuff2.png appears and the VM becomes completely hung.


 

The problem exists on Ubuntu 16.04, 18.04, and 20.04, and seems to be MORE common on VMs that have been running for a while (weeks, months...).

In the Resources tab of XenCenter, it seems the VMs top out their memory graphs when the crash happens.

 

I thought the "VMs have been running a while" pattern and the memory behavior might be clues.

 

So I tried clearing the VM's cached memory right before live migration. SO FAR, this seems to address the problem. These Ubuntu Linux VMs use pretty close to 100% of their memory for cache, and the migration seems to be the straw that breaks the camel's back.

 

To do this, I've been running the following as root on the VMs:

sync; echo 1 > /proc/sys/vm/drop_caches

... and then verifying that the free and cached memory values look more like those of a freshly booted VM. See memstuff1.png for before and after memory values.


These VMs seem to live-migrate just fine now, but I have to clear the cache on a VM before migrating it.
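The clear-and-verify steps above could be wrapped in a small script so the before and after values are printed each time. This is only a sketch: the function names are made up for illustration, and it assumes the standard Linux `/proc/meminfo` layout with a `Cached:` line reported in kB.

```shell
#!/bin/sh
# Sketch only: function names are illustrative and not from any Citrix or Ubuntu tool.

# Print the "Cached:" value (in kB) from a meminfo-style file.
# Defaults to /proc/meminfo; a different path can be passed for testing.
cached_kb() {
    awk '$1 == "Cached:" { print $2 }' "${1:-/proc/meminfo}"
}

# Flush dirty pages to disk, then drop the clean page cache.
# Writing to drop_caches requires root.
drop_page_cache() {
    sync
    echo 1 > /proc/sys/vm/drop_caches
}

# Show the cached memory before and after dropping the page cache.
drop_and_report() {
    echo "Cached before: $(cached_kb) kB"
    drop_page_cache || return 1
    echo "Cached after:  $(cached_kb) kB"
}
```

After sourcing the file as root on the VM, calling `drop_and_report` right before starting the live migration should show the cached value falling to something closer to a freshly booted VM.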

 

Has anyone else had this problem?

Is there a better way to handle this?

 

 

 

 

4 minutes ago, Tobias Kreidl said:

The other option would be of course to shut the VM down and migrate it "un-live," but that's clearly less desirable.

Is this only taking place with Ubuntu VMs?

 

Pretty much all of my Linux VMs are Ubuntu, so I don't have a lot of others to test against, but it doesn't happen with vendor-provided Linux appliances or with Windows VMs.

I can try a CentOS VM.

3 minutes ago, Tobias Kreidl said:

Hmm, might be an Ubuntu bug then. Seems off that it's restricted to only that OS if it's a Citrix issue!

 

-=Tobias

Totally possible. I've been running XS since 6.0, so I'm not a total noob to the game.

 

Keep in mind that Ubuntu 16.04 to 20.04 covers quite a few kernel versions, and the problem has been there for us no matter which XS patch or Linux patch was applied, from about 7.2 to 8.0.

 

Anyway if anyone else runs into this problem, that's the work-around: clear the VM memory cache.
