
fresh installed 6.5 server fails to join pool


Alberto Sierra

Question

Posted

Hey folks,

 

I've been trying to find a solution for my problem in this forum for about a week and finally I have decided to ask.

A couple of weeks ago one of the servers in our pool had a network card failure. It was the master, and I went through the painful process of emergency transition, SR recovery, and VM migration. Currently I'm running a pool with a single server.

The failed server's NIC was replaced, but after having problems with the network ordering I reinstalled it from scratch. I configured eth0 as the management interface and joined the pool. The next thing that happens is that the joined server loses all interfaces and connectivity, and I need to remove it from the pool (host-forget, remove the DB on the slave, restart networking).

I've repeated this process several times with the same result. The new server loses all network and needs to be reset for the NICs to be visible again.

 

There are a few things to notice:

The master has a bonded management interface

Right after the server joins the pool, there is an "unknown" interface configured with the IP address originally configured in the management interface (eth0) and the management interface is not configured and not attached to any network.

Right after the server joins the pool, all NICs show as disconnected.

Finally, the weirdest thing is that the new server's password is reconfigured (by the master?) with the original root password from the failed server.

 

Any ideas how to get this host back into the pool?

Recommended Posts

Posted
22 minutes ago, Tobias Kreidl said:

The server to add should have just one management interface present -- even if it will eventually become a bond on the pool to which it is being added. Make sure it was properly ejected from the original pool, so that no trace of it remains before adding it back in. You might need to run the xe command "xe host-declare-dead uuid=(UUID-of-host)".

 

-=Tobias

 

Thanks Tobias,

That xe command fails after I have removed the server from the pool and reset its network to default. Is there any other command I can use to make sure the master is the only server in the pool database? The password thing makes me think there are still some traces of the failed server in the pool DB.
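For anyone checking the same thing, here is a minimal sketch of how to see which hosts the pool database still lists, assuming the standard xe CLI on the master. `xe host-list --minimal` prints the host UUIDs as one comma-separated line; the helper name `count_hosts` is made up for illustration and is not part of xe.

```shell
# Hypothetical helper: count the UUIDs in the comma-separated output of
# `xe host-list --minimal`, read from stdin.
count_hosts() {
    tr ',' '\n' | grep -c '[^[:space:]]'
}

# On the master (not executed here):
#   xe host-list params=uuid,name-label,address
#   xe host-list --minimal | count_hosts
# If a stale entry is still listed, it can be removed with the host-forget
# command already mentioned in the question:
#   xe host-forget uuid=<UUID-of-stale-host>
```

If the count is higher than the number of servers actually in the pool, the extra entry is probably the leftover record of the failed host.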

 

 

 

Posted

Thanks Alan,

 

Since the only thing I didn't check was the patch level, I went to the master and did:

# xe patch-list |grep name-label
              name-label ( RO): XS65E003
              name-label ( RO): XS65E006
              name-label ( RO): XS65E002
              name-label ( RO): XS65E005
              name-label ( RO): XS65E001
              name-label ( RO): XS65E008
              name-label ( RO): XS65ESP1
              name-label ( RO): XS65E007

The new server didn't show anything. After installing SP1 on the new server, I tried to join the pool, and now I get the error:

Quote

This server is a different version to the master

 

If I try the same command on the new server, I now get the same list as on the master in the pool. Interestingly, XenCenter doesn't show any updates installed on the master, just the original base install.
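Rather than comparing the two listings by eye, the check can be sketched as a small script. It assumes the output of "xe patch-list" from each host has been saved to a file (for example with "xe patch-list > master.txt"); the function names and file names are hypothetical.

```shell
# Extract the patch name-labels from `xe patch-list` output on stdin, sorted.
patch_labels() {
    sed -n 's/^ *name-label ( RO): *//p' | sort
}

# Print the labels found in the first saved listing but not in the second.
# $1 and $2 are files holding `xe patch-list` output from two hosts.
missing_patches() {
    mp_a=$(mktemp) && mp_b=$(mktemp) || return 1
    patch_labels < "$1" > "$mp_a"
    patch_labels < "$2" > "$mp_b"
    comm -23 "$mp_a" "$mp_b"
    rm -f "$mp_a" "$mp_b"
}

# Example (not executed here):
#   missing_patches master.txt newhost.txt
```

Empty output means every patch listed on the first host is also listed on the second. It says nothing about the "different version" error itself, only about the hotfix lists.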

Posted

The server to add should have just one management interface present -- even if it will eventually become a bond on the pool to which it is being added. Make sure it was properly ejected from the original pool, so that no trace of it remains before adding it back in. You might need to run the xe command "xe host-declare-dead uuid=(UUID-of-host)".

 

-=Tobias

Posted

Not really, that's the process. As long as the NIC ordering is the same and the slave to be added is on the same hotfix level as the pool, then yes, you just configure a single eth0, and once the host joins it will inherit all of your pool bonds. Make sure time is synchronized between the two servers.

 

--Alan--
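The time and NIC-ordering checks above could be sketched roughly as follows. This is only an illustration: the helper name `clock_skew` and the host placeholder are assumptions, and what counts as an acceptable skew is not stated in this thread.

```shell
# Hypothetical helper: absolute difference, in seconds, between two epoch
# timestamps given as $1 and $2.
clock_skew() {
    cs_d=$(( $1 - $2 ))
    [ "$cs_d" -lt 0 ] && cs_d=$(( 0 - cs_d ))
    echo "$cs_d"
}

# Example (not executed here), run from the master:
#   clock_skew "$(date +%s)" "$(ssh root@<new-host> date +%s)"
# To compare NIC ordering, list device names and MACs on each host:
#   xe pif-list params=device,MAC
```

A result of more than a few seconds suggests fixing NTP on both servers before retrying the join.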

 

Archived

This topic is now archived and is closed to further replies.
