
The VDI is not available - The specified storage repository scan failed.


kashif imran

Question

Hi!

Kindly help me out; I am new to Citrix XenServer. We have XenServer 6.0.2 (free version) installed on multiple hosts, with 3 HBA storage repositories. Everything was fine until a week ago. Then one of my VMs shut down on its own, and when I tried to start it I got the error "The VDI is not available". That VM is stored on one of the storage repositories. Before I was able to resolve this issue, the same thing happened with another machine: the VM shut down, and when I tried to start it I got the same "The VDI is not available" error. This VM is stored on a different storage repository.

When I try to rescan any of my storage repositories, I get this error: "The specified storage repository scan failed."

After visiting various forums, I ran several commands to check the SRs. Their outputs are below:

cat /var/log/SMlog | more
Its output is attached in the SMLog file.
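The failures can also be pulled out of the log with a grep along these lines (the pattern is only a guess at what matters):

grep -iE "error|fail" /var/log/SMlog | tail -n 50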

ls /var/run/sr-mount/a95b1972-9f78-74ac-c31d-7bba6045bef
ls: /var/run/sr-mount/a95b1972-9f78-74ac-c31d-7bba6045bef: No such file or directory

This SR error now affects all of my storage repositories. Kindly help me out. Thanks in advance.


Attachment: SMLog.txt


Recommended Posts


Also, when I scan the SR, the output is as below; this output is the same for all the SRs.

xe sr-scan uuid=e913dc5f-4e75-d195-a7fa-e9a495c4ea51
Error code: SR_BACKEND_FAILURE_40
Error parameters: , The SR scan failed [opterr=Command ['/usr/sbin/lvs', '--noheadings', '--units', 'b', '-o', '+lv_tags', '/dev/VG_XenStorage-e913dc5f-4e75-d195-a7fa-e9a495c4ea51'] failed (5): /dev/disk/by-scsid/36006016032c03200a4f6872fd1f2e211/sdr: read failed after 0 of 4096 at 0: Input/output error
/dev/disk/by-scsid/36006016032c03200a4f6872fd1f2e211/sds: read failed after 0 of 4096 at 0: Input/output error
/dev/disk/by-scsid/36006016032c03200a4f6872fd1f2e211/sdn: read failed after 0 of 4096 at 0: Input/output error
/dev/disk/by-scsid/36006016032c03200a4f6872fd1f2e211/sdo: read failed after 0 of 4096 at 0: Input/output error
/dev/sdb: read failed after 0 of 4096 at 0: Input/output error
/dev/sdb: read failed after 0 of 4096 at 4398046445568: Input/output error
/dev/sdb: read failed after 0 of 4096 at 4398046502912: Input/output error
/dev/sdb: read failed after 0 of 4096 at 0: Input/output error
/dev/sdb: read failed after 0 of 4096 at 4096: Input/output error
/dev/sdb: read failed after 0 of 4096 at 0: Input/output error
/dev/sdr: read failed after 0 of 4096 at 0: Input/output error
/dev/sdr: read failed after 0 of 4096 at 2362231947264: Input/output error
/dev/sdr: read failed after 0 of 4096 at 2362232004608: Input/output error
/dev/sdk: read failed after 0 of 4096 at 0: Input/output error
/dev/sdk: read failed after 0 of 4096 at 4096: Input/output error
/dev/sdk: read failed after 0 of 4096 at 0: Input/output error
/dev/sdm: read failed after 0 of 4096 at 0: Input/output error
/dev/sdm: read failed after 0 of 4096 at 6597069701120: Input/output error
/dev/sdm: read failed after 0 of 4096 at 6597069758464: Input/output error
/dev/sdm: read failed after 0 of 4096 at 0: Input/output error
/dev/sdm: read failed after 0 of 4096 at 4096: Input/output error
/dev/sdm: read failed after 0 of 4096 at 0: Input/output error
/dev/sdn: read failed after 0 of 4096 at 0: Input/output error
/dev/sdn: read failed after 0 of 4096 at 2362231947264: Input/output error
/dev/sdn: read failed after 0 of 4096 at 2362232004608: Input/output error
/dev/sdn: read failed after 0 of 4096 at 0: Input/output error
/dev/sdn: read failed after 0 of 4096 at 4096: Input/output error
/dev/sdn: read failed after 0 of 4096 at 0: Input/output error
/dev/sdo: read failed after 0 of 4096 at 0: Input/output error
/dev/sdo: read failed after 0 of 4096 at 2362231947264: Input/output error
/dev/sdo: read failed after 0 of 4096 at 2362232004608: Input/output error
/dev/sdo: read failed after 0 of 4096 at 0: Input/output error
/dev/sdo: read failed after 0 of 4096 at 4096: Input/output error
/dev/sdo: read failed after 0 of 4096 at 0: Input/output error
Volume group "VG_XenStorage-e913dc5f-4e75-d195-a7fa-e9a495c4ea51" not found
Skipping volume group VG_XenStorage-e913dc5f-4e75-d195-a7fa-e9a495c4ea51],
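For what it's worth, a quick way to see whether dom0 considers those devices alive at all is something like this (a sketch using the device names from the error above; the list may differ per host):

# print the SCSI state ("running" or "offline") of each failing device
for d in sdb sdk sdm sdn sdo sdr; do
    echo -n "$d: "; cat /sys/block/$d/device/state
done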


Thanks for the reply!

So you mean we have a connectivity issue with the storage? Is it a storage issue or a XenServer issue?
Is there any way to resolve it without restarting the server? I have other VMs running on those servers, and on that storage as well, and they are working fine. If it were a physical server or storage issue, my other VMs should also be experiencing these problems.
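If it helps to tell the two apart, the SAN path state can be checked from dom0; this is only a sketch and assumes the hosts use the standard multipath and FC HBA stack:

multipath -ll                              # lists each LUN with its active/failed paths
cat /sys/class/fc_host/host*/port_state    # HBA link state ("Online" vs "Linkdown")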


Also, I tried to boot one of the affected VMs on another server to check whether it is a server issue, but it didn't work; I got the same "The VDI is not available" error.
Now all the storage repositories have become disconnected from that machine. I tried to reconnect them all but am now facing the attached error.
Kindly help me out.


Attachment: SR Error 2.png


Thanks, Kon, for your continued attention!

The details you asked for are below:

fdisk -l

Disk /dev/sdg: 2362.2 GB, 2362232012800 bytes
255 heads, 63 sectors/track, 287191 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System
/dev/sdg1 1 267350 2147483647+ ee EFI GPT

pvs /dev/sdg
Failed to read physical volume "/dev/sdg"

vgs
VG #PV #LV #SN Attr VSize VFree
VG_XenStorage-b2b751a0-fce4-6c07-1d15-2311d74415a5 1 2 0 wz--n- 127.96G 7.71G

pvs
PV VG Fmt Attr PSize PFree
/dev/sda3 VG_XenStorage-b2b751a0-fce4-6c07-1d15-2311d74415a5 lvm2 a- 127.96G 7.71G

I don't have deep knowledge of these commands, so let me know if the outputs above are enough for your purposes. Also, as I mentioned earlier, I have more than one host connected to the same storage, and when I run the same commands on the other servers I get different output. For example, on another node I got the following:

pvs
/dev/sdb: read failed after 0 of 4096 at 0: Input/output error
/dev/sds: read failed after 0 of 4096 at 0: Input/output error
/dev/sdt: read failed after 0 of 4096 at 0: Input/output error
/dev/sde: read failed after 0 of 4096 at 0: Input/output error
/dev/sdg: read failed after 0 of 4096 at 0: Input/output error
/dev/sdh: read failed after 0 of 4096 at 0: Input/output error
/dev/sdj: read failed after 0 of 4096 at 0: Input/output error
/dev/sdk: read failed after 0 of 4096 at 0: Input/output error
/dev/sdn: read failed after 0 of 4096 at 0: Input/output error
/dev/sdo: read failed after 0 of 4096 at 0: Input/output error
PV VG Fmt Attr PSize PFree
/dev/sda3 VG_XenStorage-3a99e81f-addc-a55d-dc4d-c7ec10e72c0f lvm2 a- 127.96G 127.96G

vgs
/dev/sdb: read failed after 0 of 4096 at 0: Input/output error
/dev/sds: read failed after 0 of 4096 at 0: Input/output error
/dev/sdt: read failed after 0 of 4096 at 0: Input/output error
/dev/sde: read failed after 0 of 4096 at 0: Input/output error
/dev/sdg: read failed after 0 of 4096 at 0: Input/output error
/dev/sdh: read failed after 0 of 4096 at 0: Input/output error
/dev/sdj: read failed after 0 of 4096 at 0: Input/output error
/dev/sdk: read failed after 0 of 4096 at 0: Input/output error
/dev/sdn: read failed after 0 of 4096 at 0: Input/output error
/dev/sdo: read failed after 0 of 4096 at 0: Input/output error
VG #PV #LV #SN Attr VSize VFree
VG_XenStorage-3a99e81f-addc-a55d-dc4d-c7ec10e72c0f 1 1 0 wz--n- 127.96G 127.96G

Now I am at a total loss; nothing comes to mind to get this issue resolved, and it is becoming serious for me.


The output is as below:

dd if=/dev/sdg bs=1024 count=1 | hexdump -C
1+0 records in
1+0 records out
1024 bytes (1.0 kB) copied, 0.000759245 seconds, 1.3 MB/s
00000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
000001c0 01 00 ee fe ff ff 01 00 00 00 ff ff ff ff 00 00 |................|
000001d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
000001f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 55 aa |..............U.|
00000200 45 46 49 20 50 41 52 54 00 00 01 00 5c 00 00 00 |EFI PART....\...|
00000210 82 72 74 f4 00 00 00 00 01 00 00 00 00 00 00 00 |.rt.............|
00000220 ff ff ff 12 01 00 00 00 22 00 00 00 00 00 00 00 |........".......|
00000230 de ff ff 12 01 00 00 00 ec b9 53 b1 8a 9b 86 40 |..........S....@|
00000240 81 28 44 2e 35 23 c0 38 02 00 00 00 00 00 00 00 |.(D.5#.8........|
00000250 80 00 00 00 80 00 00 00 86 d2 54 ab 00 00 00 00 |..........T.....|
00000260 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00000400
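Side note: the "EFI PART" signature at offset 0x200 is a GPT header sitting exactly where an LVM physical volume label ("LABELONE", normally in the second 512-byte sector) would be. One way to confirm whether the PV label is gone (a sketch, assuming /dev/sdg is the LUN in question):

# search the first four sectors for an LVM label; no output means no PV label
dd if=/dev/sdg bs=512 count=4 2>/dev/null | hexdump -C | grep LABELONE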

And I have the following files in the backup directory:

ls /etc/lvm/backup
VG_XenStorage-851a6eaf-aadf-8dc9-119a-86659a4343ef
VG_XenStorage-a95b1972-9f78-74ac-c31d-7bba6045bef5
VG_XenStorage-b2b751a0-fce4-6c07-1d15-2311d74415a5
VG_XenStorage-e913dc5f-4e75-d195-a7fa-e9a495c4ea51
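For reference, each of those backup files records the UUID and original device of the PV it sat on in its physical_volumes section, e.g. (a sketch):

# the pv0 block holds the PV's UUID ("id") and the device it lived on
grep -A4 "pv0 {" /etc/lvm/backup/VG_XenStorage-e913dc5f-4e75-d195-a7fa-e9a495c4ea51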

One more thing: I attached a screenshot of an error specific to this server. I have three storage repositories, and all of them show the same error on this server.

Attachment: srs2.JPG


Alright, the screenshot has provided a lot of info about the pool, so a few things here:

> Is the storage full of vdisks?
> We would need to create the PV and restore the VG onto it, since there was no PV in the pvs output.
> We can also try one more thing: simply eject the server from the pool and add it back. Not sure if this would work, but we can try; a rough sketch of the commands follows below.
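Roughly, with placeholder values (and note that ejecting a host wipes its local state, so treat it as a last resort):

xe pool-eject host-uuid=<uuid-of-the-member-host>    # run against the pool master
xe pool-join master-address=<master-ip> master-username=root master-password=<password>    # run on the ejected host afterwards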


I tried to eject and then rejoin the server, but it didn't resolve anything.
The outputs are as below:

ls /etc/lvm/backup/VG_XenStorage-e913dc5f-4e75-d195-a7fa-e9a495c4ea51

/etc/lvm/backup/VG_XenStorage-e913dc5f-4e75-d195-a7fa-e9a495c4ea51

ls /etc/lvm/backup/
VG_XenStorage-851a6eaf-aadf-8dc9-119a-86659a4343ef
VG_XenStorage-8b7315cb-4b3a-9113-ce62-20c3ecd863cb
VG_XenStorage-a95b1972-9f78-74ac-c31d-7bba6045bef5
VG_XenStorage-b2b751a0-fce4-6c07-1d15-2311d74415a5
VG_XenStorage-e913dc5f-4e75-d195-a7fa-e9a495c4ea51

pvs
PV VG Fmt Attr PSize PFree
/dev/sda3 VG_XenStorage-8b7315cb-4b3a-9113-ce62-20c3ecd863cb lvm2 a- 127.96G 127.96G

vgs
VG #PV #LV #SN Attr VSize VFree
VG_XenStorage-8b7315cb-4b3a-9113-ce62-20c3ecd863cb 1 1 0 wz--n- 127.96G 127.96G
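In case it is useful, the SR attachments themselves can also be retried from the CLI by replugging this host's PBDs (a sketch with a placeholder UUID):

xe pbd-list sr-uuid=e913dc5f-4e75-d195-a7fa-e9a495c4ea51    # find this host's PBD for the SR
xe pbd-plug uuid=<pbd-uuid-from-the-list>                   # retry attaching it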


Thanks for helping!

Following the article, I ran vgdisplay --partial --verbose, but I didn't find any PV marked "unknown device". Then again, there is no PV for any SR on any of my nodes, so which PV do I need to recreate now? The command output is:

vgdisplay --partial --verbose
Partial mode. Incomplete volume groups will be activated read-only.
Finding all volume groups
Finding volume group "VG_XenStorage-8b7315cb-4b3a-9113-ce62-20c3ecd863cb"
--- Volume group ---
VG Name VG_XenStorage-8b7315cb-4b3a-9113-ce62-20c3ecd863cb
System ID
Format lvm2
Metadata Areas 1
Metadata Sequence No 5
VG Access read/write
VG Status resizable
MAX LV 0
Cur LV 2
Open LV 1
Max PV 0
Cur PV 1
Act PV 1
VG Size 127.96 GB
PE Size 4.00 MB
Total PE 32758
Alloc PE / Size 30783 / 120.25 GB
Free PE / Size 1975 / 7.71 GB
VG UUID JEXBfC-na57-auS2-hpPZ-Fhee-pAnk-TR1ti6

--- Logical volume ---
LV Name /dev/VG_XenStorage-8b7315cb-4b3a-9113-ce62-20c3ecd863cb/MGT
VG Name VG_XenStorage-8b7315cb-4b3a-9113-ce62-20c3ecd863cb
LV UUID tF7qBp-3C0D-SkDx-9wqo-batN-u2PD-2nfxj4
LV Write Access read/write
LV Status available
# open 0
LV Size 4.00 MB
Current LE 1
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 1024
Block device 252:0

--- Logical volume ---
LV Name /dev/VG_XenStorage-8b7315cb-4b3a-9113-ce62-20c3ecd863cb/VHD-28cd4542-4ae7-446a-89fa-835ec84189bc
VG Name VG_XenStorage-8b7315cb-4b3a-9113-ce62-20c3ecd863cb
LV UUID 17pvyQ-a1C4-kfqx-CQYW-CVgI-sIna-EWko6B
LV Write Access read/write
LV Status available
# open 1
LV Size 120.24 GB
Current LE 30782
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 1024
Block device 252:1

--- Physical volumes ---
PV Name /dev/sda3
PV UUID wIUc85-XyLW-0M2v-otgV-o46D-uVJK-5yqs0V
PV Status allocatable
Total PE / Free PE 32758 / 1975

Do I need to create a PV for every SR, and on all the nodes?



We need the SR UUID for which the PV is missing; you can simply grab it from XenCenter. Next, open the file named VG_XenStorage-<sr-uuid in question> in /etc/lvm/backup with any editor; from it you can get the UUID of the PV and the information about it (we need that to recreate the PV).

Then follow on from step 2 (pvcreate) in that article and it should be all good.
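A rough sketch of that recovery, using the e913dc5f SR as the example and assuming /dev/sdg is the LUN it lived on (check the PV UUID and the device against the backup file first; running pvcreate against the wrong device is destructive):

# 1. recreate the PV in place, reusing its old UUID from the backup file
pvcreate --uuid <pv-uuid-from-the-backup-file> \
         --restorefile /etc/lvm/backup/VG_XenStorage-e913dc5f-4e75-d195-a7fa-e9a495c4ea51 \
         /dev/sdg

# 2. restore the VG metadata onto the recreated PV
vgcfgrestore -f /etc/lvm/backup/VG_XenStorage-e913dc5f-4e75-d195-a7fa-e9a495c4ea51 \
         VG_XenStorage-e913dc5f-4e75-d195-a7fa-e9a495c4ea51

# 3. rescan the SR so XenServer picks the volumes back up
xe sr-scan uuid=e913dc5f-4e75-d195-a7fa-e9a495c4ea51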


I experienced a similar issue at my last place. The metadata for our VMs was getting corrupted, and no matter how fast we tried to fix one, two more got corrupted. We believe it started when our pool master experienced a random reboot. We went down the path of repairing the VDIs by detaching and reattaching the write cache drive as described here:
http://support.citrix.com/article/CTX131201
but we could not keep up with the propagation.

Here is what ultimately helped fix the metadata on the VMs. Keep in mind this needs to be done on all hosts before you can start machines with corrupted metadata, or it will start all over again.

" -quoted
1) On the host with issues, execute xe-toolstack-restrat
- Wait for the host to go Red and then it will come back up in maintenance mode
2) Manually make the host exit maintenance mode by right clicking on the XenCenter console
3) There should be a message asking if you want to bring the Vms back (these are the ambers), when you see this message click 'Ok'
- The pool member will appear to be ok again
4) Run xe-toolstack-restart again on the same host
- Wait for the host to come back up in maintenance mode
5) Manually exit maintenance mode
- There should be no message this time around
6) You should now see the previously amber Vms slowly getting dropped from the pool host, wait until the movement stops and do the same for all affected hosts.

Note: The amber Vms will not be able to properly power on until all affected hosts have been cleaned using this procedure.

I would also apply the OneShot mode and disable int-remmaping:
- Uncomment “ONESHOT=yes” from /etc/sysconfig/irqbalance
- Add ommu=no-intremap to the default XenServer moot.c32 file located in /boot/extlinux.conf
"

