Need help on messed up Management IP Addresses in a pool


Daniel Sidler

Question

One for the gurus out there.

 

I tried to modify the management interface settings on my 4-host XenServer pool. The change itself was simple: I only needed to remove one of the three DNS server IPs configured for the hosts.

Unfortunately, XenCenter messed up the management IP addresses of the slaves, leaving the pool in a semi-manageable state. Here's what happened:

 

Before:

Xen1: Mgmt Address xx.xx.xx.51

Xen2: Mgmt Address xx.xx.xx.52 (this is the pool master)

Xen3: Mgmt Address xx.xx.xx.53

Xen4: Mgmt Address xx.xx.xx.54

 

Now, according to XenCenter:

Xen1: Mgmt Address xx.xx.xx.53

Xen2: Mgmt Address xx.xx.xx.52 (pool master, unchanged)

Xen3: Mgmt Address xx.xx.xx.54

Xen4: Mgmt Address xx.xx.xx.51

 

But in reality this is not true at all! When I use XenCenter to connect to the console of Xen4, I'm actually on Xen1, and I cannot connect to the consoles of Xen1 and Xen3 at all. It's as if XenCenter failed halfway through modifying the mgmt IPs of the slaves.

 

It gets more confusing when I SSH into the hosts with PuTTY and check with ifconfig:

Xen1: Mgmt Address on xapi2 is xx.xx.xx.53, but xsconsole tells me mgmt IP is xx.xx.xx.55!

Xen2: Mgmt Address on xapi2 is xx.xx.xx.52 (pool master, as expected)

Xen3: Mgmt Address on xapi2 is xx.xx.xx.54, but xsconsole tells me mgmt IP is xx.xx.xx.53!

Xen4: Mgmt Address on xapi2 is xx.xx.xx.51, but it thinks of itself as Xen1 according to xsconsole and the shell prompt!

 

So this explains why Xen4 acts as Xen1 on XenCenter.

 

Xen3 and Xen1 seem to be confused about what their mgmt IP is.
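
The comparison between what xapi has recorded and what the bridge actually carries can be made repeatable with a small script run on each host. This is only a sketch under assumptions: the management bridge is xapi2 as in the ifconfig output above, the host's own UUID is taken from INSTALLATION_UUID in /etc/xensource-inventory (rather than from the hostname, which is exactly what cannot be trusted here), and the helper names live_ip_from_ifconfig and compare_mgmt_ip are made up for illustration. The xe query may itself fail on a slave whose toolstack cannot reach the master.

```shell
#!/bin/bash
# Compare the management IP recorded in the xapi database with the address
# actually configured on the management bridge of this host.

# Extract the first IPv4 address from ifconfig-style output on stdin.
# Handles both "inet addr:1.2.3.4" (older net-tools) and "inet 1.2.3.4".
live_ip_from_ifconfig() {
    sed -n 's/^[[:space:]]*inet \(addr:\)\{0,1\}\([0-9.]*\).*/\2/p' | head -n 1
}

# Print MATCH or MISMATCH for a recorded/live address pair.
compare_mgmt_ip() {
    local recorded="$1" live="$2"
    if [ -n "$recorded" ] && [ "$recorded" = "$live" ]; then
        echo "MATCH $recorded"
    else
        echo "MISMATCH recorded=$recorded live=$live"
    fi
}

# Only query the host when the xe CLI and the inventory file are present.
if command -v xe >/dev/null 2>&1 && [ -r /etc/xensource-inventory ]; then
    . /etc/xensource-inventory          # provides INSTALLATION_UUID
    bridge="${1:-xapi2}"                # assumption: bridge name from this thread
    recorded=$(xe pif-list management=true host-uuid="$INSTALLATION_UUID" params=IP --minimal)
    live=$(ifconfig "$bridge" | live_ip_from_ifconfig)
    compare_mgmt_ip "$recorded" "$live"
fi
```

A MISMATCH line on a slave would correspond to the situation described for Xen1 and Xen3, where the configured address and the address the host reports for itself disagree.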

 

The question is how this can be fixed. Rebooting is not an option at this time, as these hosts run critical workloads plus some hyperconverged storage providers, and they need to be shut down in a controlled fashion. Is there a way to modify network settings locally on each host? My impression is that using xe commands is difficult, as xe considers these things to be centrally managed by the pool master.

 

Any ideas where to start?

 

PS: this is on XenServer 6.5. Yes I know. Long story.

 

 

 

 


I would probably start by verifying that time sync is good across all servers. Then make sure that /etc/xensource/pool.conf on the master contains the word master, and that on each slave host it contains slave:<ip address of master>. Then I would do an xe-toolstack-restart on the master first, then on all of the slaves, and see if it all comes back up properly. If not, I would use xsconsole to do an emergency network reset on the management interfaces of the hosts that are incorrect. I would probably do that with the host evacuated, as it will likely require a restart afterwards.
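
The pool.conf check described above could be scripted roughly as follows before restarting the toolstack. This is a minimal sketch, not an official tool: pool_conf_ok is a hypothetical helper name, and the script assumes it is invoked with the expected role and the master's IP as its two arguments. The restart itself is left commented out so nothing happens until the check has been read and understood.

```shell
#!/bin/bash
# Validate the contents of /etc/xensource/pool.conf for a given role.
# usage: pool_conf_ok master|slave <master-ip> <file-contents>
# Returns 0 when the contents match the role, 1 when they do not,
# and 2 for an unknown role.
pool_conf_ok() {
    local role="$1" master_ip="$2" content="$3"
    case "$role" in
        master) [ "$content" = "master" ] ;;
        slave)  [ "$content" = "slave:$master_ip" ] ;;
        *)      return 2 ;;
    esac
}

# Only act when run on a host with the role and master IP given as arguments.
if [ -r /etc/xensource/pool.conf ] && [ $# -eq 2 ]; then
    if pool_conf_ok "$1" "$2" "$(cat /etc/xensource/pool.conf)"; then
        echo "pool.conf looks right for role $1"
        # xe-toolstack-restart    # uncomment to restart; master first, then slaves
    else
        echo "pool.conf does not match role $1; not restarting the toolstack" >&2
    fi
fi
```

Assuming the script is saved as check-pool-conf.sh (a made-up name), it would be run as `./check-pool-conf.sh master xx.xx.xx.52` on the master and `./check-pool-conf.sh slave xx.xx.xx.52` on each slave.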

 

--Alan--

 


Thanks Alan.

 

- /etc/xensource/pool.conf on pool master contains the word 'master' 

- /etc/xensource/pool.conf on each pool slave contains the string 'slave:xx.xx.xx.52', which is the IP of the master

 

I will do the toolstack restart later today, when I have at least iLO access to the boxes.

 

Question regarding emergency network reset. How can I do this on the management interfaces only without everything being wiped and the server rebooted? At least that's what xsconsole threatens me with when I choose this option.

 

Best,

Dan

 


xe-toolstack-restart will be safe to do while running production. Unfortunately, the emergency network reset will require a server reboot. I'm hoping just a toolstack restart will square everything away for you. If not, a reset of the networking and a restart of the host is the next step.

 

--Alan--

 


The emergency network reset will be OK if you don't have any custom iSCSI settings or other customizations not in the pool metadata.

Once the host is brought back up, it should repopulate itself.

 

You can also try "xe pool-sync-database" and see if that helps; as Alan mentioned, make sure all hosts are properly synchronized to NTP.
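
A rough way to check whether the hosts' clocks agree is sketched below. It assumes root SSH access from one machine to every host and takes the host names or IPs as arguments. max_skew is a hypothetical helper, and because the timestamps are collected one after another, SSH latency is included in the result, so treat a skew of a second or two as noise.

```shell
#!/bin/bash
# Print the difference in seconds between the largest and smallest
# epoch timestamps passed as arguments (0 for zero or one timestamp).
max_skew() {
    local min="" max="" t
    for t in "$@"; do
        if [ -z "$min" ] || [ "$t" -lt "$min" ]; then min="$t"; fi
        if [ -z "$max" ] || [ "$t" -gt "$max" ]; then max="$t"; fi
    done
    echo $(( ${max:-0} - ${min:-0} ))
}

# When host names are given, collect `date +%s` from each one over SSH.
if [ $# -gt 0 ]; then
    stamps=""
    for h in "$@"; do
        stamps="$stamps $(ssh "root@$h" date +%s)"
    done
    # $stamps is intentionally unquoted so each timestamp is its own argument.
    echo "max clock skew across hosts: $(max_skew $stamps)s"
fi
```

If the reported skew is large, fixing NTP first and only then running xe pool-sync-database on the master seems the safer order.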

 

-=Tobias

 


Thanks Alan and Tobias.

 

Properly (and most importantly: calmly!) doing the toolstack restart on each slave brought the pool back to life. The reason why one of the slaves identified itself with the wrong host name in xsconsole was that the original (and now incorrect) A records were still in DNS. Apparently XenServer hosts do a reverse lookup of their own IP address to determine the host name to display in xsconsole? Interesting.
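
Whether DNS and a host agree about the host's name can be checked with something like the sketch below. It does not reproduce whatever xsconsole does internally, which this thread does not establish. It only compares the local hostname with the name that getent returns for a given IP, and getent consults /etc/hosts as well as DNS. The helper names short_name and names_agree are made up for illustration.

```shell
#!/bin/bash
# Print the short (first-label) form of a host name, lower-cased.
short_name() {
    echo "${1%%.*}" | tr 'A-Z' 'a-z'
}

# names_agree <local-hostname> <name-from-reverse-lookup>
# Succeeds only when a name was returned and the short names are equal.
names_agree() {
    [ -n "$2" ] && [ "$(short_name "$1")" = "$(short_name "$2")" ]
}

# When a management IP is given, look it up and compare with `hostname`.
if [ $# -eq 1 ]; then
    ip="$1"
    looked_up=$(getent hosts "$ip" | awk '{print $2; exit}')
    if names_agree "$(hostname)" "$looked_up"; then
        echo "OK: $ip resolves to $looked_up, matching $(hostname)"
    else
        echo "CHECK DNS: $ip resolves to '${looked_up}', but this host is $(hostname)" >&2
    fi
fi
```

Running it on each host with that host's management IP as the argument should flag stale records like the ones that made one slave show the wrong name.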

 

Again, thanks to both of you.

 

PS: All the best to you Tobias. You will be missed in here :-(
