
Issue with host which is and is not in a pool


Ricardo Mendes1709159988

Question

Hi guys,

I have a pool with 4 hosts on 8.2. I was making some changes to the servers' physical configuration (removing their hard drives and adding new ones in a RAID1 configuration).

I did this to every server in the pool and left the master for last. Then I went to `xsconsole` and changed the master to a server that had already been serviced. I connected to that server and the pool was working well. I then removed the previous master from the pool and removed it from XenCenter.

 

I did what I had to do, just as with all the other servers. Finally I went to add the server back to the pool; the operation proceeded, but the server didn't show up in the list. I waited around 5 minutes, which should have been more than enough time, but nothing. I disconnected XenCenter from the pool and reconnected: nothing. On this server I ran `systemctl reboot` to see if it would show up. Nothing.


So I went to the CLI on the master and ran `xe host-list`, and I only saw three servers; the fourth wasn't there. I restarted the toolstack and ran the command again, with the same result.
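
For anyone who wants to script that check, here is a minimal sketch. It assumes `xe host-list --minimal` prints the host UUIDs as one comma-separated line; the helper name `host_in_list` is made up for illustration, and the UUID is the one reported by the stuck server.

```shell
#!/bin/sh
# host_in_list UUID LIST (hypothetical helper)
# LIST is a comma-separated list of host UUIDs, as assumed to be
# printed by `xe host-list --minimal`.
# Returns 0 if UUID appears in LIST, 1 otherwise.
host_in_list() {
    case ",$2," in
        *",$1,"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Only query the pool when the xe CLI is actually available.
if command -v xe >/dev/null 2>&1; then
    hosts=$(xe host-list --minimal)
    if host_in_list "b9b542b2-770e-4ea6-a17b-074d1ddc54ed" "$hosts"; then
        echo "host is known to the pool"
    else
        echo "host is NOT known to the pool"
    fi
fi
```

Run on the master, this should only confirm what the plain `xe host-list` output already shows, but it is easier to repeat after each toolstack restart.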

I connected to this fourth server and ran `xe host-list` and `xe pool-list`, the result was this:

[12:50 hv1 ~]# xe pool-list
The uuid you supplied was invalid.
type: host
uuid: b9b542b2-770e-4ea6-a17b-074d1ddc54ed
[12:50 hv1 ~]# xe host-list
The uuid you supplied was invalid.
type: host
uuid: b9b542b2-770e-4ea6-a17b-074d1ddc54ed

I restarted the toolstack on this server, with the same results.

In XenCenter I went to Server > Add and entered the IP and the credentials. I get an error saying "Server IP is a member of pool 'pool name' and is already connected."

`xe host-list` still returns three servers only, not this one. 

 

So I opened `xsconsole` on this server, and noticed there is no Management Interface selected. Running `ip add` shows interface `xenbr0` and the IP is correctly configured (I am accessing via ssh).

In `xsconsole` on the affected server, when I go to Network and Management Interface > Configure Management Interface, the "Management Interface Configuration" screen says "No interfaces present", and the Display NICs option shows nothing.

Emergency Network Reset asks for the network configuration, says the server is a slave, asks me to specify the pool master, and offers the correct IP of the master. But when I proceed, the server reboots and the error persists: no management interface appears, nothing.
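
The role that Emergency Network Reset reports can also be read directly on the host. This is a rough sketch under an assumption: that the host stores its role in `/etc/xensource/pool.conf` as either `master` or `slave:<master address>`, which is the usual layout but is not confirmed anywhere in this thread. The function name `pool_role` is hypothetical.

```shell
#!/bin/sh
# pool_role FILE (hypothetical helper)
# Prints the role recorded in a pool.conf-style file.
# Assumed format: the single word "master", or "slave:<master address>".
pool_role() {
    conf=$(cat "$1") || return 1
    case "$conf" in
        master)  echo "master" ;;
        slave:*) echo "slave of ${conf#slave:}" ;;
        *)       echo "unknown: $conf" ;;
    esac
}

# Assumed default location; skipped silently if the file is not there.
if [ -r /etc/xensource/pool.conf ]; then
    pool_role /etc/xensource/pool.conf
fi
```

If this prints a slave role pointing at the current master while the master does not list the host, the two sides disagree about membership, which matches the symptoms described here.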

 

On the current master if I try to remove the host I get an error saying the UUID is not valid:

# xe host-forget uuid=b9b542b2-770e-4ea6-a17b-074d1ddc54ed --force
The uuid you supplied was invalid.
type: host
uuid: b9b542b2-770e-4ea6-a17b-074d1ddc54ed

The server that is not working is constantly displaying this error:

Broadcast message from systemd-journald@hv1 (Fri 2021-09-17 14:20:52 WEST):
xapi-nbd[7933]: main: Failed to log in via xapi's Unix domain socket in 300.000000 seconds

Broadcast message from systemd-journald@hv1 (Fri 2021-09-17 14:20:52 WEST):
xapi-nbd[7933]: main: Caught unexpected exception: (Failure

Broadcast message from systemd-journald@hv1 (Fri 2021-09-17 14:20:52 WEST):
xapi-nbd[7933]: main:   "Failed to log in via xapi's Unix domain socket in 300.000000 seconds")

 

The command `xe pool-sync-database` should fail if a slave isn't accepting connections, but it runs on the current master and displays no errors. From what I understand, if the server did belong to the pool, it should return `You attempted an operation which involves a host which could not be contacted.`

 

So how can the host be part of the pool and not part of it at the same time? It's like a quantum state, except I am observing it!

I appreciate any help, thank you!


9 answers to this question


Hi Tobias, thank you for your reply. I changed the master to another host using xsconsole. I ejected the old master from the pool and removed it from XenCenter.

So the master role was properly assigned to a new server. That went without issues. The new master does have the role of master.

 

My problem happened when I re-added the previous master to the pool after making the necessary changes. The drive was wiped and the system was installed from scratch, as was done with all the others.

 

When I run `xe pool-recover-slaves` on the master, it only returns 2 UUIDs (those of the slaves already in the pool); the UUID of the server that's stuck isn't showing.


Hi Alan, thank you for your feedback. I didn't see any error when adding the host to the pool. It simply added the SRs, which ended up in a failed state, namely "DVD drives", "Local storage" and "Removable storage". The host never showed up in the pool, and no error was displayed. I waited around 5 minutes (usually a host joins the pool in less than 1 minute), but nothing.

 

I took the UUIDs of those SRs and checked them: `xe sr-list` doesn't return those SRs, and `xe pbd-list sr-uuid=sr_uuid` returned nothing, so I ran `xe sr-forget uuid=sr_uuid` to remove their references from the pool, and that worked.
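
The cleanup above can be sketched as a small guard that only forgets an SR when no PBD references it. This is an illustration, not a tested procedure: the function names and the `XE` override variable are made up here, and it assumes `xe pbd-list sr-uuid=... --minimal` prints an empty line when nothing matches.

```shell
#!/bin/sh
# XE can be pointed at a stand-in command for a dry run (hypothetical
# convention); by default it is the real xe CLI.
XE=${XE:-xe}

# sr_is_orphan SR_UUID (hypothetical helper)
# Succeeds when the pool reports no PBD for the SR.
# Returns 2 if the query itself fails.
sr_is_orphan() {
    pbds=$("$XE" pbd-list sr-uuid="$1" --minimal) || return 2
    [ -z "$pbds" ]
}

# forget_if_orphan SR_UUID (hypothetical helper)
# Forgets the SR only when no PBD references it; otherwise leaves it alone.
forget_if_orphan() {
    if sr_is_orphan "$1"; then
        "$XE" sr-forget uuid="$1"
    else
        echo "skipping $1: PBDs still attached or query failed" >&2
        return 1
    fi
}
```

Usage would be one call per leftover SR, for example `forget_if_orphan sr_uuid`, with `sr_uuid` replaced by the real value.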

 

Now when I try to add this host to the pool it gives this error:

"Server 'A.B.C.D' is a member of pool 'POOL-1' and is already connected."

 

(But it is neither a member of the pool nor connected.)


Hi Tobias,

 

So on the master there was no reference to the host aside from the SRs I removed, and those had no PBDs associated with them. Trying to remove the host from the pool wasn't working because it said the UUID was invalid.

I am currently reinstalling the OS on this host and will try to add it again afterwards.

 

All the servers are synced with NTP. Thanks.
