
XenServer + Dell Compellent + Multipathing


ABICO Licensing

Question

Hi all -

 

I have a problem similar to the one discussed here:

 

https://discussions.citrix.com/topic/397933-xenserver-add-session-path/

 

but it's not quite the same.

 

I have eight XenServers in a pool, connected to a pair of Dell Compellents. Each Compellent has four iSCSI interfaces, and they are configured in replication, so they effectively present as a single unit. Each Compellent serves six SRs. The pool is multipathed to the Compellent. Five of the SRs are old and one is new. The "problem" is with the new one.

 

I am able to scan/attach absolutely fine, however for the five old SRs I see:

 

4 of 4 paths active (4 iSCSI sessions)

 

whereas on the new SR I see:

 

4 of 4 paths active (2 iSCSI sessions)

 

mpathutil status confirms four active sessions:

 

An example old SR:

36000d31000c8ac000000000000000003 dm-6 COMPELNT,Compellent Vol
size=1.0G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=enabled
  |- 8:0:0:1  sdh  8:112  active ready running
  |- 9:0:0:1  sdk  8:160  active ready running
  |- 10:0:0:1 sdp  8:240  active ready running
  `- 21:0:0:1 sdv  65:80  active ready running

 

The new SR:
36000d31000e17e00000000000000004f dm-15 COMPELNT,Compellent Vol
size=1.0T features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  |- 21:0:0:7 sdab 65:176 active ready running
  |- 7:0:0:7  sdag 66:0   active ready running
  |- 6:0:0:7  sdaf 65:240 active ready running
  `- 10:0:0:7 sdae 65:224 active ready running
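
For anyone comparing listings like these by eye, here is a minimal Python sketch that counts paths and distinct SCSI host numbers per WWID. It assumes the plain `multipath -ll` layout shown above (WWID first on the header line, no user-friendly alias), and it relies on the usual open-iscsi behaviour where each software iSCSI session gets its own SCSI host, so the first field of H:C:T:L stands in for a session. The function name `summarize_multipath` is made up for illustration.

```python
import re

# Header line of a multipath map, e.g. "36000d3... dm-6 COMPELNT,Compellent Vol".
# Assumption: no user-friendly alias precedes the WWID.
MAP_RE = re.compile(r"^(\S+)\s+dm-\d+\s")
# Path line, e.g. "  |- 8:0:0:1  sdh  8:112  active ready running".
PATH_RE = re.compile(r"(\d+):\d+:\d+:\d+\s+sd[a-z]+\s+\d+:\d+\s+\w+")


def summarize_multipath(text):
    """Return {wwid: {"paths": count, "hosts": sorted SCSI host numbers}}."""
    summary = {}
    current = None
    for line in text.splitlines():
        header = MAP_RE.match(line)
        if header:
            current = {"paths": 0, "hosts": set()}
            summary[header.group(1)] = current
            continue
        path = PATH_RE.search(line)
        if path and current is not None:
            current["paths"] += 1
            current["hosts"].add(int(path.group(1)))
    return {
        wwid: {"paths": info["paths"], "hosts": sorted(info["hosts"])}
        for wwid, info in summary.items()
    }


# The new SR's listing, copied from the post.
SAMPLE = """\
36000d31000e17e00000000000000004f dm-15 COMPELNT,Compellent Vol
size=1.0T features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  |- 21:0:0:7 sdab 65:176 active ready running
  |- 7:0:0:7  sdag 66:0   active ready running
  |- 6:0:0:7  sdaf 65:240 active ready running
  `- 10:0:0:7 sdae 65:224 active ready running
"""

print(summarize_multipath(SAMPLE))
```

On the new SR's listing this reports four paths on four different hosts (6, 7, 10, 21), so at the multipath layer the new SR does not obviously differ from the old ones.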

 

The only odd thing I've found (other than 4 vs 2) is:

 

[root@xen01 iscsi]# iscsiadm -m node -p 10.0.15.45 --rescan
Rescanning session [sid: 15, target: iqn.2002-03.com.compellent:5000d31000e17e22, portal: 10.0.15.45,3260]
Rescanning session [sid: 16, target: iqn.2002-03.com.compellent:5000d31000e17e25, portal: 10.0.15.45,3260]
iscsiadm: invalid error code 65280
iscsiadm: Could not execute operation on all sessions: (null)
 

I get the same result for any of the Compellent's IPs.

 

Related, I have tried attaching to both the "primary" and "secondary" Compellent with the same result.

 

Finally, I will mention that this install predates me, and I notice NO custom config in the multipath.conf as I'd expect for the Compellent. However, seeing as six of seven SRs have been working for literally years, whatever cause for concern here must be minor.

 

I cannot understand what causes the new SR to have fewer sessions to the *exact same* iSCSI target. TBH, I'm not sure it matters! I just would like to understand the discrepancy more than anything.

 

Hopefully someone can help... as you might imagine I'm out of ideas. :(

Link to comment


When it comes to backup, I'm willing to pay for a product. Of course, what I'm willing to do and what the people who actually have the money are willing to do aren't always in alignment. ;)

 

As far as iSCSI sessions go, still no luck... adding the LUN still creates four paths across two sessions. :( 

 

For grins I went ahead and moved the pool master to one of the "spare" machines where the new multipath.conf was created, but still only two sessions.

 

I am at a loss - I don't understand how one iSCSI target can have different results between different LUNs... Eh, I have one more thing to try. Still, generally, at a loss, though. :)

 

Edit: No, that was not it.

Link to comment

I only did the discovery process (sendtargets, I assume we're talking about) on the pool master, which also has the updated multipath.conf. If the master had connected fine and the others hadn't, that would have told me something, but nothing I tried improved the results. I did pick the wildcard target.

 

What is making this difficult is that I have multiple working LUNs and one non-working LUN on the target, so iscsiadm -m session shows everything is fine. I don't see a way to show sessions per LUN. If I knew which chassis/controller/interface was broken, I would know where to look to solve the issue!
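
One possible way to get a per-LUN view is to post-process the text of `iscsiadm -m session -P 3`. The sketch below is only that: it assumes the print-level-3 output contains a `SID: N` line for each session, followed by `scsiN Channel .. Id .. Lun: N` lines for its attached devices. The name `luns_by_session` and the sample text (heavily abbreviated, with placeholder IQNs) are made up for illustration.

```python
import re
from collections import defaultdict

# Assumed line shapes in "iscsiadm -m session -P 3" output.
SID_RE = re.compile(r"^\s*SID:\s*(\d+)")
LUN_RE = re.compile(r"^\s*scsi\d+\s+Channel\s+\d+\s+Id\s+\d+\s+Lun:\s*(\d+)")


def luns_by_session(text):
    """Map each LUN number to the sorted list of session IDs that expose it."""
    lun_to_sids = defaultdict(set)
    sid = None
    for line in text.splitlines():
        match = SID_RE.match(line)
        if match:
            sid = int(match.group(1))
            continue
        match = LUN_RE.match(line)
        if match and sid is not None:
            lun_to_sids[int(match.group(1))].add(sid)
    return {lun: sorted(sids) for lun, sids in sorted(lun_to_sids.items())}


# Hypothetical, abbreviated input; real output has many more lines per session.
SAMPLE = """\
Target: iqn.example:target-a
        SID: 2
        Host Number: 7  State: running
        scsi7 Channel 00 Id 0 Lun: 1
        scsi7 Channel 00 Id 0 Lun: 7
Target: iqn.example:target-b
        SID: 14
        Host Number: 21 State: running
        scsi21 Channel 00 Id 0 Lun: 1
"""

print(luns_by_session(SAMPLE))  # {1: [2, 14], 7: [2]}
```

A LUN whose list is shorter than its neighbours' points at the session (and therefore the controller interface) to look at first.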

 

I am doing the Add SR via XenCenter. I don't know the command line for it, but maybe that's an approach?

Link to comment

The iscsiadm process should IMO be run on each and every host. I've had to do this a number of times as the only way to get each one to eventually behave and see all paths activated. 

 

Adding the SR via XenCenter should be fine; just pick the wildcard interface when it asks which controller interface to connect to. Make sure, of course, that multipath is enabled and configured correctly and identically on each host. As a handy guide to iscsiadm procedures, one good resource is this: https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/5/html/online_storage_reconfiguration_guide/scanningnewdevs-iscsi

 

-=Tobias

Link to comment

Yeah, that all makes sense... but would you agree that running the discovery process on only the pool master and then adding the SR on the pool master should (at least) net a proper connection on just the pool master? 

 

When I first started this adventure, I had a dozen PuTTY windows open and ran a discovery script with each (experimental) pass. That is really, annoyingly time-consuming, so in the last several iterations I've stopped doing that. :|

Link to comment

Gotcha, makes sense.

 

So here's what may be a final question... I've been reading all the piecemeal iscsiadm documentation and I can use:

 

iscsiadm --mode session -r [SID] --op new

 

(followed up with:

iscsiadm -m node -T [target] -p [ip] --op update -n node.session.nr_sessions -v [# of sessions]

to make it "stick")

 

to manually start a new session to a given iSCSI target. Are there guidelines here?

Link to comment

Some other stuff, as it's come up... 

 

 iscsiadm -m session --rescan

 

shows what I expect it to.

 

Then,

 

 iscsiadm -m session --sid=[SID] -P 3

 

shows something interesting. There are eight SIDs associated with the Compellent - 2 chassis x 2 controllers x 2 interfaces = 8 SIDs.

 

All of the previously existing LUNs appear on four SIDs, e.g., LUN 1 appears on SIDs 1, 2, 12, and 14.

 

The NEW LUN appears on only three SIDs - 2, 4, 12. It should also appear on SID 14.

 

SID 12 is Compellent #2, bottom controller, 1st interface

SID 14 is Compellent #2, bottom controller, 2nd interface

 

Given that it would be insane for this to be a network problem (all the other LUNs on SID 14 are up...), I can't help but wonder if this is a timing issue. I don't really know why or how... but maybe it takes time for the Compellent to present a LUN on an interface? Maybe all my adding and deleting has never included "wait X hours for LUNs to appear"?

 

Also, can anyone comment on:

 

https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/storage_administration_guide/iscsi-scanning-interconnects

 

Specifically this bit:

 

Important

The sendtargets command used to retrieve --targetname and --portal values overwrites the contents of the /var/lib/iscsi/nodes database. This database will then be repopulated using the settings in /etc/iscsi/iscsid.conf. However, this will not occur if a session is currently logged in and in use.

To safely add new targets/portals or delete old ones, use the -o new or -o delete options, respectively. For example, to add new targets/portals without overwriting /var/lib/iscsi/nodes, use the following command:

iscsiadm -m discovery -t st -p target_IP -o new

To delete /var/lib/iscsi/nodes entries that the target did not display during discovery, use:

iscsiadm -m discovery -t st -p target_IP -o delete

You can also perform both tasks simultaneously, as in:

iscsiadm -m discovery -t st -p target_IP -o delete -o new

 

as it applies to XenServer?

 

And, sorry for the stream of consciousness, but here's another oddity: if you DETACH and FORGET an SR, it stays in the iSCSI session DB for at least a couple of hours:

 

[root@]# iscsiadm -m session --sid=12 -P 3

 

                Host Number: 17 State: running
                scsi17 Channel 00 Id 0 Lun: 1
                        Attached scsi disk sdab         State: running
                scsi17 Channel 00 Id 0 Lun: 3
                        Attached scsi disk sdac         State: running
                scsi17 Channel 00 Id 0 Lun: 5
                        Attached scsi disk sdad         State: running
                scsi17 Channel 00 Id 0 Lun: 7
                        Attached scsi disk sdae         State: running

LUN 7 was detached and forgotten quite some time ago!

Link to comment

The relevant LUN is not shown in multipath -ll

 

multipath.conf has been updated on all hosts.

 

Oddly, rescanning targets and showing LUNs gives different results on different machines. The machine I've been primarily working with shows LUN 7 on 3 SIDs. Another machine shows LUN 7 on only 2. 

 

I think I may have to throw in the towel and brave Copilot!

 

Can anyone talk to me about /etc/multipath/wwids:

 

# Multipath wwids, Version : 1.0
# NOTE: This file is automatically maintained by multipath and multipathd.
# You should not need to edit this file in normal circumstances.
#
# Valid WWIDs:
/36000d31000c8ac000000000000000008/
/36000d31000c8ac000000000000000006/
/36000d31000c8ac000000000000000004/
/36000d31000c8ac000000000000000003/
/36000d31000c8ac000000000000000005/
/36000d31000e17e00000000000000004f/
/36000d31000c8ac000000000000000007/
/36000d31000c8ac000000000000000055/
/36000d31000c8ac000000000000000058/
 

It shows a bunch of NLA LUNs....  I wonder if this is causing an issue?
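
To see which entries in that file no longer correspond to a live multipath map, something like the following Python sketch could be used. It assumes the `/WWID/` one-per-line layout shown above; the function names are made up, and the "active" list in the example is hypothetical (in practice it would come from the header lines of `multipath -ll`). It only reports differences and does not touch the file.

```python
import re

# Each entry in the wwids file is a WWID wrapped in slashes, e.g. "/3600.../".
WWID_LINE_RE = re.compile(r"^/(.+)/$")


def parse_wwids_file(text):
    """Return the WWIDs listed in wwids-file text, in file order."""
    wwids = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and the comment header
        match = WWID_LINE_RE.match(line)
        if match:
            wwids.append(match.group(1))
    return wwids


def stale_wwids(wwids_text, active_wwids):
    """WWIDs recorded in the file that are not among the active maps."""
    active = set(active_wwids)
    return [w for w in parse_wwids_file(wwids_text) if w not in active]


# Shortened copy of the file from the post.
SAMPLE_FILE = """\
# Multipath wwids, Version : 1.0
# Valid WWIDs:
/36000d31000c8ac000000000000000003/
/36000d31000e17e00000000000000004f/
/36000d31000c8ac000000000000000058/
"""

# Hypothetical list of currently active maps, for illustration only.
ACTIVE = [
    "36000d31000c8ac000000000000000003",
    "36000d31000e17e00000000000000004f",
]

print(stale_wwids(SAMPLE_FILE, ACTIVE))
```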

 

 

Link to comment

I am now taking the scorched earth approach:

 

Kill iSCSI sessions:

iscsiadm -m node -T [IQN] -p [IP]:[PORT] -u

 

Kill nodes so sessions don't restart:

iscsiadm -m node -o delete -T [IQN]

 

Check to be sure there are no sessions:

iscsiadm -m session

 

Remove targets from iSCSI db:
iscsiadm -m discoverydb -t sendtargets -p [IP] -o delete

 

Verify there are no remaining iSCSI targets:
ls /var/lib/iscsi/nodes

 

Reboot the server.

 

Start over.
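
Since these steps are destructive and have to be repeated per target and per portal, here is a rough Python sketch that only builds and prints the iscsiadm command lines listed above, so they can be reviewed before anything is run. The function names are made up, the default port of 3260 is an assumption, and the IQN and IP in the example are the ones from the rescan output earlier in the thread. It does not run anything, and the reboot is left to the operator.

```python
import shlex


def teardown_commands(iqn, portals, port=3260):
    """Build (but do not run) the iscsiadm teardown commands for one target."""
    cmds = []
    for ip in portals:
        # Log out of the session to each portal.
        cmds.append(["iscsiadm", "-m", "node", "-T", iqn, "-p", f"{ip}:{port}", "-u"])
    # Delete the node records so the sessions do not restart.
    cmds.append(["iscsiadm", "-m", "node", "-o", "delete", "-T", iqn])
    # List sessions to confirm none remain.
    cmds.append(["iscsiadm", "-m", "session"])
    for ip in portals:
        # Remove the discovery records for each portal.
        cmds.append(
            ["iscsiadm", "-m", "discoverydb", "-t", "sendtargets", "-p", ip, "-o", "delete"]
        )
    return cmds


def as_shell(cmds):
    """Render the command lists as shell-quoted lines for review."""
    return "\n".join(" ".join(shlex.quote(arg) for arg in cmd) for cmd in cmds)


print(as_shell(teardown_commands(
    "iqn.2002-03.com.compellent:5000d31000e17e22", ["10.0.15.45"]
)))
```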

 

Somewhat hilariously, even after nuking everything off this server, XenCenter still shows:

 

2 of 3 paths active (2 iSCSI sessions)

 

LOL.

Link to comment

NOOOOOPPPPPEE.

 

4 of 4 paths active (2 iSCSI sessions)

 

[root]# multipath -ll
36000d31000c8ac000000000000000058 dm-0 COMPELNT,Compellent Vol
size=1.0T features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=enabled
  |- 6:0:0:7  sde  8:64   active ready running
  |- 10:0:0:7 sdo  8:224  active ready running
  |- 16:0:0:7 sdae 65:224 active ready running
  `- 18:0:0:7 sdag 66:0   active ready running
 

And it appears to see LUN 7 on four sessions....

 

[root]# iscsiadm -m session --sid=1 -P 3 | grep Lun
                scsi6 Channel 00 Id 0 Lun: 2
                scsi6 Channel 00 Id 0 Lun: 4
                scsi6 Channel 00 Id 0 Lun: 6
                scsi6 Channel 00 Id 0 Lun: 7
[root]# iscsiadm -m session --sid=5 -P 3 | grep Lun
                scsi10 Channel 00 Id 0 Lun: 2
                scsi10 Channel 00 Id 0 Lun: 4
                scsi10 Channel 00 Id 0 Lun: 6
                scsi10 Channel 00 Id 0 Lun: 7
[root]# iscsiadm -m session --sid=11 -P 3 | grep Lun
                scsi16 Channel 00 Id 0 Lun: 1
                scsi16 Channel 00 Id 0 Lun: 3
                scsi16 Channel 00 Id 0 Lun: 5
                scsi16 Channel 00 Id 0 Lun: 7
[root]# iscsiadm -m session --sid=13 -P 3 | grep Lun
                scsi18 Channel 00 Id 0 Lun: 1
                scsi18 Channel 00 Id 0 Lun: 3
                scsi18 Channel 00 Id 0 Lun: 5
                scsi18 Channel 00 Id 0 Lun: 7
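
To make the cross-check explicit: the first field of each multipath path (H:C:T:L) is the SCSI host number, and the `scsiN` prefix in the iscsiadm output is the same number. A small Python sketch of that join, using only the values shown above (the function name is made up for illustration):

```python
def sessions_for_paths(hctls, host_to_sid):
    """Map each multipath path (an "H:C:T:L" string) to the iSCSI SID behind it."""
    result = {}
    for hctl in hctls:
        host = int(hctl.split(":")[0])
        # None means no known session owns this SCSI host.
        result[hctl] = host_to_sid.get(host)
    return result


# Values copied from the multipath -ll and iscsiadm output above.
paths = ["6:0:0:7", "10:0:0:7", "16:0:0:7", "18:0:0:7"]
host_to_sid = {6: 1, 10: 5, 16: 11, 18: 13}

mapping = sessions_for_paths(paths, host_to_sid)
distinct_sids = sorted({sid for sid in mapping.values() if sid is not None})
print(distinct_sids)  # [1, 5, 11, 13]
```

So by this measure the four paths sit on four separate sessions, even though XenCenter reports two.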
 

Link to comment

Seems I was not patient enough, and/or not reboot happy enough. After purging and recreating the node database and cleaning up the wwid database and rebooting a second time, the SR comes up correctly:

 

4 of 4 paths active (4 iSCSI sessions)

 

I know I need a beer, and I owe you two a beer. I dunno how I get that across the internet, but if you tell me I'll do it. THANK YOU.

 

Tangentially related, my conversation with Dell netted this documentation:

https://downloads.dell.com/solutions/storage-solution-resources/3132-CD-V_Configuration Guide for XenServer with SC Storage.pdf

 

*Nowhere* does it mention a multipath.conf entry, and both gentlemen on the phone had nothing to offer. I'm not reading *anything* into that, I just find it *weird.*

Link to comment
