
7.4 > 7.6 Migration Failed:


Mauritz Swanepoel

Question

I have seen numerous similar reports of the same problem (originating from version 7). Today I faced the same issue: I live migrated a VM from Server A to Server B, it ran for a while, then stopped with the following error:

 

"Failed","Migrating VM 'vmname' from 'serverA' to 'serverB'
Internal error: Storage_interface.Does_not_exist(_)
Time: 00:09:48","dilbert","Mar 25, 2019 3:47 PM"

 

Both servers use LVM. This particular VM is only 40GB in size, and serverA had about 1TB available. serverB has 16TB available, so this isn't a resource-related issue. Needless to say, on serverB there is now a 40GB VDI which cannot be removed from XenCenter, and on serverA the VM was in a suspended state. Trying to resume the VM gave the following:

 

"Failed","Resuming VM 'vmname'
VM cannot be resumed because it has no suspend VDI","serverA","Mar 25, 2019 4:03 PM"

 

I was able to get it working again, which was a relief, but this is casting a lot of shadow on the trust in our environment. Of late I have experienced nothing but severe issues with how XenServer works with storage and, to be honest, I wish I had never walked the Xen road (a little too late for us now).

 

As live migration is a standard feature across all hypervisors, I would really like to understand why the above is happening and what I can do to get this fixed (it appears these issues have been around for years now).

 

 


6 answers to this question


I would dig into SMlog to see if anything clearer can be found as to the root cause. I haven't dug into how those storage migrations actually work.

I'm not a big fan of storage migrations; if anything goes wrong, it's entirely possible to lose data. But of course it should have worked. Since these are local SRs with LVM, I assume the data goes over the management interface to accomplish the transfer. Congestion or driver issues on the management interfaces could cause this. Do you have bonded management interfaces? That could be a root cause as well.
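For digging through the logs, a small filter can help narrow things down. The following is only a sketch: the keyword list is my own guess at what is worth flagging (it is not an official list of SMlog markers), and the function names are made up for illustration.

```python
import re
from typing import Iterable, List

# Assumption: illustrative keywords only, chosen for this thread's error.
# Adjust them to whatever actually shows up in your logs.
KEYWORDS = ("Does_not_exist", "Exception", "Traceback", "FAILED", "error")
PATTERN = re.compile("|".join(re.escape(k) for k in KEYWORDS), re.IGNORECASE)


def suspicious_lines(lines: Iterable[str]) -> List[str]:
    """Return the lines that contain any keyword (case-insensitive)."""
    return [line.rstrip("\n") for line in lines if PATTERN.search(line)]


def scan_log(path: str) -> List[str]:
    """Read a log file and return only the lines worth a closer look."""
    # errors="replace" keeps the scan going if the log has odd bytes in it.
    with open(path, "r", errors="replace") as handle:
        return suspicious_lines(handle)
```

Calling `scan_log("/var/log/SMlog")` on both hosts and comparing the hits around the time of the failed migration would be the obvious first use; the same function works on /var/log/xensource.log.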

 

--Alan--

 


Could be a number of things, including the lack of a compatible network interface for the VM to plug into on that end (no supported subnet, for example). Most of the issues I've had are with slight CPU masking inconsistencies.

 

You might try the migration from the CLI and see if maybe there's a "--force" option; even if not, you're likely to get more of an error message out of the CLI than from XenCenter.
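As a rough sketch of what the CLI attempt could look like, the helper below only assembles the argument list for `xe vm-migrate` and runs nothing. The parameter names (`uuid=`, `host=`, `live=`, `force=`, `vdi:<vdi-uuid>=<sr-uuid>`, `remote-master=` and friends) are as I remember them from the xe help, so treat them as assumptions and check `xe help vm-migrate` on your version first. The function name and the UUID placeholders are hypothetical.

```python
from typing import Dict, List, Optional


def build_migrate_command(
    vm_uuid: str,
    vdi_to_sr: Dict[str, str],
    host: Optional[str] = None,
    remote: Optional[Dict[str, str]] = None,
    live: bool = True,
    force: bool = False,
) -> List[str]:
    """Assemble an argv list for `xe vm-migrate`; nothing is executed here."""
    argv = ["xe", "vm-migrate", f"uuid={vm_uuid}",
            f"live={'true' if live else 'false'}"]
    if host:
        argv.append(f"host={host}")
    # Storage migration: one vdi:<vdi-uuid>=<destination-sr-uuid> pair per disk.
    for vdi_uuid, sr_uuid in sorted(vdi_to_sr.items()):
        argv.append(f"vdi:{vdi_uuid}={sr_uuid}")
    # Assumption: these remote-* keys are what storage migration expects.
    for key in ("remote-master", "remote-username", "remote-password"):
        if remote and key in remote:
            argv.append(f"{key}={remote[key]}")
    if force:
        # Assumption: the force switch is spelled force=true, not --force.
        argv.append("force=true")
    return argv
```

Printing `" ".join(build_migrate_command("<vm-uuid>", {"<vdi-uuid>": "<sr-uuid>"}, host="serverB"))` and reviewing the result before running it by hand keeps this safe. I would only add `force=True` after the plain attempt has shown its error message.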

 

-=Tobias


As Alan and Tobias suggested, and assuming this is local to local:

 

  1. Are the CPUs consistent and the same? CPU flags play a major role when a live migration happens.
  2. Are the XenServer tools up to date? VM suspend is handled by the PV tools.
  3. Is the management network congested?
  4. Are there errors in /var/log/SMlog or /var/log/xensource.log?
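For the first item, a quick way to compare CPU flags between the two hosts is a plain set difference. This is a minimal sketch, assuming you have copied a whitespace-separated flags string from each host (for example the `flags` line of /proc/cpuinfo in dom0). The function names are made up, and XenServer's own compatibility check works on feature masks, so this is only a rough first look.

```python
from typing import Dict, Set


def parse_flags(flags_field: str) -> Set[str]:
    """Split a whitespace-separated flags string into a set of flag names."""
    return set(flags_field.split())


def compare_flags(source_flags: str, dest_flags: str) -> Dict[str, Set[str]]:
    """Report flags present on only one of the two hosts."""
    src, dst = parse_flags(source_flags), parse_flags(dest_flags)
    return {
        # Flags the VM may be relying on that the destination cannot offer.
        "missing_on_destination": src - dst,
        # Usually harmless for this direction, but worth knowing about.
        "extra_on_destination": dst - src,
    }
```

If `missing_on_destination` comes back empty for serverA to serverB, the CPU side is probably not the culprit and the logs in item 4 are the better place to look.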

 

 

 


I have opened a bug https://bugs.xenserver.org/browse/XSO-944 

 

Thank you all for your input. I also understand it's not "your product", so my rants are simply to get my opinion out there. We built our entire business on the Citrix platform, so when issues emerge (and they always seem to be related to the weird ways storage works) one has to be able to vent.

 

The way I see this is that it's either local or part of some kind of incompatibility. If it's local, I completely understand and am more than happy to admit fault. In our particular case, both hosts are identical (except for storage) and we've not had a single network-related issue before. We do daily and weekly full snapshots and they never fail. Unless proven otherwise, for now I am not looking at our infrastructure as the cause of the issue.

 

If it's due to incompatibility (CPUs, management tools, etc.), surely it's up to the vendor to first do a check of the "vitals" before continuing with the migration? Live migrations, done through a GUI, should never break due to incompatibilities. It appears to be common knowledge what causes these migrations to fail (and you have hinted that data loss can also occur), so how is it that Citrix does not verify, or at least warn a user upfront, about incompatibilities before migrating? It has access to both servers during migration, so it should be able to verify all issues (even in the event where the new storage may be too small to complete the transfer). The result is a broken migration with VDIs "stuck" on the host that cannot be deleted, no decent information is given to troubleshoot, and if you don't have a support contract with Citrix you're stuffed. We pay for our licenses but support is an ad hoc cost, which is crazy.

 

I know I am venting, but it's just frustrating to have this feeling of "doom" looming over our infrastructure and not being able to perform a simple live migration (which is also forced on you, as the version changes twice a year and Citrix only supports what appears to be six months' worth of versions).

 

I'm hoping some of the developer gurus can help me with the bug request. I will keep this thread posted.


The forced upgrade model doesn't work well for us either, so that's why we stay on 7.1 LTSR, which of course is a licensing cost.

I don't do any storage migrations with local storage, so if there are issues related to that I wouldn't ever see them. All of my storage is either iSCSI or NFS and I rarely even move that storage with storage migration. I've had loss of data in the past, so I'm not a fan of storage migration at all.

 

--Alan--

 

 


Archived

This topic is now archived and is closed to further replies.
