
Non-persistent VMware vSAN issue - Cannot open the disk


Tyler Dickey1709161241

Question

Hello,

 

We noticed an issue in our VMware vSAN environment with non-persistent machines. When a single host is put into maintenance mode or taken offline, we are unable to power on non-persistent machines. The ones already running, however, are still accessible and usable.

 

Error message:

 

Cannot open the disk '/vmfs/volumes/vsan:52c791eced213298-dc6e3740d4a20d98/bde23d5f-949a-8a93-3f74-246e96b66c6c/NOTREALNAME-xd-delta.vmdk' or one of the snapshot disks it depends on.

 

Environment

Delivery controller: 1912 LTSR CU1

VDA: Same as delivery controller

VMware hosts: VMware ESXi, 6.7.0, 15160138

vSAN: RAID 5 configuration (we think this might be causing the issue, but don't know why). Version: vSAN 6.7 U3, RAID 5 with dedup/compression.

 


2 answers to this question



Just in case anyone is wondering, this is caused by the "force provisioning" setting under the datastore settings. Disabling that setting fixed this issue, though from my understanding, having 5 nodes instead of 4 would also have fixed it (unless 2 hosts went offline). Force provisioning ignores the vSAN resource requirements but still allows the machines to power on.
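To make the node-count point concrete, here is a minimal Python sketch of the placement arithmetic behind it. This is not a VMware tool or API; the function and its names are purely illustrative. The minimum host counts per policy are the documented vSAN requirements: RAID-5 erasure coding needs 4 fault domains, so a 4-node cluster with one host in maintenance cannot place new objects per policy, while a 5-node cluster still can.

# Hedged sketch (Python), not VMware tooling: just the placement arithmetic
# behind the 4-node vs 5-node point above. The minimum fault-domain counts
# are the documented vSAN requirements; the function itself is illustrative.

VSAN_MIN_FAULT_DOMAINS = {
    "RAID-1 (FTT=1)": 3,  # 2 replicas + 1 witness
    "RAID-5 (FTT=1)": 4,  # 3 data + 1 parity components
    "RAID-6 (FTT=2)": 6,  # 4 data + 2 parity components
}

def can_place_new_object(policy: str, total_hosts: int, hosts_offline: int) -> bool:
    """Can a new object (e.g. an MCS -xd-delta disk) be created per policy?"""
    available = total_hosts - hosts_offline
    return available >= VSAN_MIN_FAULT_DOMAINS[policy]

for hosts in (4, 5):
    ok = can_place_new_object("RAID-5 (FTT=1)", hosts, hosts_offline=1)
    print(f"{hosts} hosts, 1 in maintenance: "
          + ("placement possible" if ok else "blocked unless force provisioning overrides"))

Running it prints "blocked" for 4 hosts and "placement possible" for 5, which matches the behaviour we saw when a single host went into maintenance.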


Reply from VMware below. Really trying to get this figured out, and I'm very surprised no one else has seen this issue. Always make sure you test the redundancy of your environment ;).

 

My concern is not the provisioning process itself; persistent or not, that does not affect vSAN space consumption differently. What concerns me is that, based on the data in the vmware.log, it does not appear to be creating a vSAN object. That means it would rely on the free space available in the namespace object (which holds all files: VMX, redo logs, flat files, vmware.log, etc.), and that object has a maximum consumption of 255 GB for the files inside it.


This is why we would need to verify on the call; however, looking at the vmware.log, it contains information that supports that hunch.

2020-09-04T19:28:12.021Z| vmx| I125: DISKLIB-LIB_CREATE : CREATE CHILD: "./WDRSXDRAMOEA007-57d4aab5-8414-4d24-88dc-7fc0f751ce57-xd-delta.vmdk.REDO_4whaqa" -- vmfsSparse cowGran=0 allocType=0 policy='(("stripeWidth" i1) ("cacheReservation" i0) ("proportionalCapacity" i0) ("hostFailuresToTolerate" i1) ("forceProvisioning" i0) ("spbmProfileId" "c6ba463b-b90e-42ee-b970-4da2ca5d2dae") ("spbmProfileGenerationNumber" l+0) ("replicaPreference" "Capacity") ("iopsLimit" i0) ("checksumDisabled" i0) ("spbmProfileName" "DONOT USE - RAID 5 Default Policy"))'

From this output, we can see that the child object / redo log being created here is 'vmfsSparse'.
 - vSAN uses vsanSparse (vmfsSparse is the traditional format).
It also has the policy tagged as 'DONOT USE - RAID 5 Default Policy'.

During the boot cycle we also see a 'flat' file, which would be the VMFS/traditional block format.

2020-09-04T19:28:11.890Z| vmx| I125: DISKLIB-VMFS : "/vmfs/volumes/vsan:52c791eced213298-dc6e3740d4a20d98/bde23d5f-949a-8a93-3f74-246e96b66c6c/WDRSXDRAMOEA007_IdentityDisk-flat.vmdk" : open successful (524293) size = 16777216, hd = 0. Type 3
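Since the relevant details are buried in the policy='(...)' blob, here is a small Python sketch that pulls the key/value pairs out of a DISKLIB-LIB_CREATE line like the one above. The log format is taken from that line (abridged here); the regexes and function names are my own and purely illustrative, not a VMware utility.

import re

# Abridged copy of the DISKLIB-LIB_CREATE entry quoted above.
LOG_LINE = (
    """2020-09-04T19:28:12.021Z| vmx| I125: DISKLIB-LIB_CREATE : CREATE CHILD: ... -- vmfsSparse """
    """policy='(("stripeWidth" i1) ("hostFailuresToTolerate" i1) ("forceProvisioning" i0) """
    """("replicaPreference" "Capacity") ("spbmProfileName" "DONOT USE - RAID 5 Default Policy"))'"""
)

def parse_vsan_policy(line: str) -> dict:
    """Return the policy key/value pairs from a DISKLIB-LIB_CREATE log line."""
    blob = re.search(r"policy='\((.*)\)'", line)
    if not blob:
        return {}
    policy = {}
    # Values are either integers (i1, l+0) or quoted strings.
    for key, num, text in re.findall(r'\("(\w+)"\s+(?:[il]\+?(-?\d+)|"([^"]*)")\)', blob.group(1)):
        policy[key] = int(num) if num else text
    return policy

p = parse_vsan_policy(LOG_LINE)
print("profile:", p.get("spbmProfileName"))
print("forceProvisioning:", p.get("forceProvisioning"))
print("hostFailuresToTolerate:", p.get("hostFailuresToTolerate"))

For the line above this prints forceProvisioning: 0 and hostFailuresToTolerate: 1 under the 'DONOT USE - RAID 5 Default Policy' profile, which makes it easier to compare the policy applied to the delta disks against what is set on the datastore.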

My theory of what is happening has to do with the namespace object reaching 250 GB+ of used space.
As for why this happens while a host is in maintenance mode versus not, I am not entirely sure whether an increase would come from the Citrix side, but this is what I anticipate to be happening based on the data I have at this time.
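If it helps anyone checking the same theory: a rough way to eyeball the namespace usage is to total the file sizes in the VM's home directory on the vSAN datastore (the path format below is modeled on the error message in the question, so treat it as hypothetical). This is only a sketch under those assumptions, not VMware tooling, and it does not distinguish files backed by their own vSAN objects from files living in the namespace object, so treat the number as an upper bound rather than a verdict.

import os

NAMESPACE_LIMIT_GB = 255  # default VM home namespace size VMware's reply refers to

def vm_home_usage_gb(vm_home: str) -> float:
    """Total the on-disk size of the files in a VM's home directory, in GB."""
    total = 0
    for root, _dirs, files in os.walk(vm_home):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                pass  # redo logs / lock files can vanish mid-walk
    return total / (1024 ** 3)

# Hypothetical path, modeled on the error message in the question.
vm_home = "/vmfs/volumes/vsan:52c791eced213298-dc6e3740d4a20d98/bde23d5f-949a-8a93-3f74-246e96b66c6c"
print(f"{vm_home_usage_gb(vm_home):.1f} GB used of ~{NAMESPACE_LIMIT_GB} GB namespace limit")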

