
Layer Service - Can't find the volume


Kyle Loos

Question

We are rolling out layered images to a subset of our users (about 30) and about half of those users get an elastic layer assigned to them. Looking through the ulayersvc.log files, I am seeing errors that concern me. Here is an example:

 

2017-11-15 17:22:01,242 WARN 48 LayerConfigService: Can't find the volume UDiskP4D0003V0R1 @ \\?\Volume{93a82285-ca05-11e7-80ce-0050569768d2}\.  It will be reattached on the next login.

 

Corresponding System Event Viewer message

2017-11-15 17:22:31 Event ID: 51 Source: Disk An error was detected on device \Device\Harddisk2\DR2 during a paging operation

 

This happens fairly regularly each evening, generally between 5:00 PM and 9:00 PM. Most users are gone by then and may have logged off or simply disconnected, but Event Viewer messages indicate that these disconnects are also happening to active users. I never see them during the day (6:00 AM to 5:00 PM); they only happen in the evening, though not at a specific time.

 

The server that houses the elastic layer share is a Windows Server 2016 machine that does nothing else. No antivirus or monitoring software is installed, and this server runs no backups. I don't see any corresponding events in VMware or on the file server. All of our servers are in the same datacenter with 10 Gb connectivity. I have opened a case with Citrix, but they are at a loss.

 

Environment:

Hypervisor - VMware 6

XenApp 7.15

App Layering 4.5

 

If anyone has any insight or suggestions, I am open to them. 

 

 

 

Link to comment

23 answers to this question

Recommended Posts

  • 0

I am going to upgrade to App Layering 4.6, as we have seen some STOP errors related to the Unidesk filter driver in 4.5. We will see if that makes any difference. As a test, I put one of my Citrix servers in maintenance mode and manually attached a VHD from the elastic share. So far, after 3 days, it is still connected.

Link to comment
  • 0

Hi,

 

Did you ever get a resolution for this? We are seeing exactly the same behaviour. We have a support case open, but after a couple of days nothing seems to be forthcoming.

 

This happens on a VM with no AV, and there are definitely no background processes (backup etc.) running.

 

We will keep pushing support, but I'm not confident of a timely resolution based on our communications so far.

 

thanks

Link to comment
  • 0

Hi Phil - I never did find a resolution for this and simply gave up on elastic layers. Citrix insisted it was "something happening on our network" that caused the volumes to drop yet it would never happen during the busiest portions of the day. Frankly, when they dropped elastic layer support for Microsoft Office, I had no real desire to pursue E.L. anyways. We still use the layering appliance, we just don't use E.L.

Link to comment
  • 0

One test you can try, to narrow the issue down to either the desktop or the network and storage, is to mount a VHD from a non-App Layering VM and see if it also eventually gets disconnected.
 

If it doesn't, it could be something on the desktops causing the disconnect. Just realize that the connection to the VHD has nothing to do with the App Layering filter driver. It's solely a Windows function.

Link to comment
  • 0

We are still trying to get to the bottom of this, and support don't seem to have any suggestions other than that it's something on the network. We are seeing this on a greenfield deployment. If this is environmental, then it has been replicated by two different organisations, which at worst points to a product issue and at best to an undocumented incompatibility.

 

This does not happen if we manually mount a .vhd, or if the same .vhd from the same SMB share is mounted to a single-session OS (server VDI).

 

Does anyone actually have this working in production? I'm not convinced this actually works as a product.

Link to comment
  • 0

Yes, elastic layering on Server 2016. We are seeing frequent detaches of disks across two independent environments.

 

We can mount a .vhd and it stays connected, and we can use the same layer from the same SMB share for server VDI without problems. Once we use multi-session, we start to see errors like this in the layering log file.

 

2020-03-02 11:05:51,105 WARN 7 LayerConfigService: Can't find the volume UDiskP3D8002V0R4 @ \\?\Volume{682eed13-0000-0000-0000-100000000000}\.  It will be reattached on the next login.

2020-03-02 11:07:51,108 WARN 7 LayerConfigService: Can't find the volume UDiskP3D8008V0R1 @ \\?\Volume{0007f342-0000-0000-0000-100000000000}\.  It will be reattached on the next login.

2020-03-02 11:07:51,111 WARN 7 LayerConfigService: Can't find the volume UDiskP3D800FV0R1 @ \\?\Volume{000c69bd-0000-0000-0000-100000000000}\.  It will be reattached on the next login.

2020-03-02 11:58:51,280 WARN 14 LayerConfigService: Can't find the volume UDiskP688001V0R1 @ \\?\Volume{a4dc1710-0000-0000-0000-100000000000}\.  It will be reattached on the next login.

2020-03-02 11:58:51,284 WARN 14 LayerConfigService: Can't find the volume UDiskP3D8004V0R3 @ \\?\Volume{85832689-0000-0000-0000-100000000000}\.  It will be reattached on the next login.

 

This happens while the user is logged in, and it stops the user from using any applications that were previously working. Logging out and back in again often fixes it, but by then it has caused quite a bit of disruption.
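A minimal sketch for spotting a time-of-day pattern in these warnings, assuming the log format matches the WARN lines quoted above. The function names (`parse_detaches`, `detaches_per_hour`) are hypothetical helpers and not part of App Layering; the sample lines are copied from this post.

```python
import re
from collections import Counter
from datetime import datetime

# Matches the "Can't find the volume" WARN lines quoted in this thread.
PATTERN = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d+ WARN \d+ "
    r"LayerConfigService: Can't find the volume (?P<vol>\S+) @"
)


def parse_detaches(lines):
    """Return a list of (timestamp, volume_name) for each detach warning."""
    events = []
    for line in lines:
        match = PATTERN.match(line.strip())
        if match:
            ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S")
            events.append((ts, match.group("vol")))
    return events


def detaches_per_hour(events):
    """Count detach warnings by hour of day."""
    return Counter(ts.hour for ts, _ in events)


# Sample lines taken from the log excerpt in this post.
sample = [
    r"2020-03-02 11:05:51,105 WARN 7 LayerConfigService: Can't find the volume UDiskP3D8002V0R4 @ \\?\Volume{682eed13-0000-0000-0000-100000000000}\.  It will be reattached on the next login.",
    r"2020-03-02 11:07:51,108 WARN 7 LayerConfigService: Can't find the volume UDiskP3D8008V0R1 @ \\?\Volume{0007f342-0000-0000-0000-100000000000}\.  It will be reattached on the next login.",
    r"2020-03-02 11:58:51,280 WARN 14 LayerConfigService: Can't find the volume UDiskP688001V0R1 @ \\?\Volume{a4dc1710-0000-0000-0000-100000000000}\.  It will be reattached on the next login.",
]

events = parse_detaches(sample)
for ts, vol in events:
    print(ts.isoformat(sep=" "), vol)
print(dict(detaches_per_hour(events)))
```

Feeding it the full ulayersvc.log from several hosts would show whether the detaches cluster a fixed interval after the first logon on each host, rather than at a fixed clock time.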

 

Attached is what we see in the layering log, and the corresponding system log.

 

disk missing.PNG

disk errors event logs.PNG

Link to comment
  • 0

You don't have any VSS backups running on the VDA, do you?

Also, I want to make sure you are not using the same elastic layer share for two App Layering appliances. I had a customer doing that recently, and it causes a different set of issues due to permissions.

I have not really seen what you are reporting before. I have seen issues with user layers on Nutanix involving some type of Kerberos timeout.

Link to comment
  • 0

We don't have any VSS backups of the VDA, and there is only a single appliance. What worries me is that support have said they have never seen this before, yet we are seeing the same behaviour in two environments, and I can't see anything that has been done that deviates from best practice. User layers seem fine; it's only elastic layers. We have noticed that the SMB share is opened as the user and seems to stay open for quite some time (several hours) even after the user has logged out. Is there any process on the VDA relating to App Layering that keeps this open?

Link to comment
  • 0

On terminal servers, layers are never closed/unmounted. They will get disconnected when you reboot.

You aren't using Server 2016 for single-user VDI, are you? You can do that, but you have to make sure the VDA reboots after logout.

And I agree with support: this is not something I have seen as an issue before.

 

Link to comment
  • 0

We are using Server 2016 for VDI in some instances, but these reboot on logout and do not cause us any issues.

 

It is specifically the 2016 XenApp/RDSH hosts. If the layers are never dismounted, and the server hosting the SMB share sees connections to the .vhd as each user that has logged into the RDSH server, is it conceivable that some process running with the required privileges could dismount them, given that the user is no longer connected to the RDSH server but the SMB share is still open as that user?

Link to comment
  • 0

Hi,

 

We have finally made some progress on this. We removed App Layering from the equation entirely and ran this test:

 

1.    Log on to a server (not using the VDA or App Layering).
2.    Connect a .vhd from the file share.
3.    Log off the server.
4.    Check the file server to confirm the .vhd is still open as that user, even though there is no session on the server.
5.    Confirm the idle timer is reset every 10 minutes.
6.    Check back after more than 10 hours; the .vhd will have been dismounted.
7.    The dismount will match an anonymous access entry in the file server's log, as the ticket has expired.
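For anyone repeating this test, the "check back" step can be automated with a small poller. This is only a sketch under assumptions: it treats the mounted .vhd as a path that stops existing once the disk is dismounted, and `watch_mount` is a hypothetical helper name, not a Citrix or Windows tool. The mount path in the usage comment is a placeholder.

```python
import os
import time
from datetime import datetime


def watch_mount(path, interval_s=60, exists=os.path.exists,
                sleep=time.sleep, now=datetime.now, max_checks=None):
    """Poll `path` until it disappears after having been seen.

    Returns (first_seen, lost_at), or None if `max_checks` polls pass
    without the path disappearing. The `exists`, `sleep` and `now`
    parameters are injectable so the loop can be exercised without waiting.
    """
    first_seen = None
    checks = 0
    while max_checks is None or checks < max_checks:
        checks += 1
        if exists(path):
            if first_seen is None:
                first_seen = now()      # first time the mount was observed
        elif first_seen is not None:
            return first_seen, now()    # mount was there, now it is gone
        sleep(interval_s)
    return None


# Usage (placeholder mount point; runs until the mount vanishes):
#   seen, lost = watch_mount("E:\\")
#   print("mounted for", lost - seen)
```

If the ticket-lifetime theory in this post holds, the reported duration should come out at roughly 10 hours on a domain with default Kerberos policy.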

 

Support kept pushing us to look at SMB timers etc., which we kept explaining were set at the default of 15 minutes, so if these were taking effect the disconnect would occur much sooner.

We did notice that the .vhd stayed connected and that the idle timer of the connection on the file server never reached 15 minutes; in fact, it was reset every 10 minutes. Our assumption is that this must be a service impersonating the user, as the user is no longer logged on to that server. Once 10 hours is reached and the “service” tries to keep the SMB connection alive, the attempt fails, the connection is dropped, and you will see an “Anonymous login” event on the file server. This suggested to us that the problem was related to “Maximum lifetime for service ticket”. We increased this value to 24 hours (as we reboot our machines overnight), and this has removed the problem.

 

It appears that the software layers are attached as the first user to log in, and even after that user logs off, this remains the access method for all users on the server. When the service ticket expires, all users are disconnected from the .vhd.

Support have told me they can’t replicate this, but I have managed to reproduce it every time. It’s also worth noting that the default for “Maximum lifetime for service ticket” is 600 minutes (10 hours), and this is set in the Default Domain Policy of every Windows domain, so I’m not sure how it can’t be replicated, or how I can be the only person ever to have seen this issue.
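The timing can be worked through as a rough sketch, assuming the theory above is right: the ticket is obtained at the first logon on the host and is not usable past the “Maximum lifetime for service ticket” value. The logon and reboot times below are hypothetical examples, and the function names are illustrative only.

```python
from datetime import datetime, timedelta

# Default "Maximum lifetime for service ticket" as quoted in this thread.
DEFAULT_SERVICE_TICKET_MINUTES = 600


def expected_detach_time(first_logon, ticket_minutes=DEFAULT_SERVICE_TICKET_MINUTES):
    """Time at which the layers would drop if the ticket expires unrenewed."""
    return first_logon + timedelta(minutes=ticket_minutes)


def survives_until_reboot(first_logon, next_reboot,
                          ticket_minutes=DEFAULT_SERVICE_TICKET_MINUTES):
    """True if the ticket outlives the host's next scheduled reboot."""
    return expected_detach_time(first_logon, ticket_minutes) >= next_reboot


# Hypothetical day: first user logs on at 07:22, host reboots at 02:00 next day.
first_logon = datetime(2017, 11, 15, 7, 22, 1)
next_reboot = datetime(2017, 11, 16, 2, 0, 0)

print(expected_detach_time(first_logon))                      # default 10 h lifetime
print(survives_until_reboot(first_logon, next_reboot))        # 10 h lifetime
print(survives_until_reboot(first_logon, next_reboot, 1440))  # 24 h lifetime
```

With a 10-hour lifetime, a first logon anywhere between 7:00 AM and 11:00 AM puts the detach between 5:00 PM and 9:00 PM, which would fit the evening-only pattern described in the original question.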

 

Whilst the above does work, it does decrease security on the domain, and some people might have an issue with that. In our case, increasing this from 10 to 24 hours is not that big a problem. If you were only doing weekly reboots, though, having a service ticket lifetime of a week would be a real problem.

I could be wrong about the above, as support can’t really give me a straight answer, but it all seems to fit, and since we made the change the issue has been resolved for us completely. I also appreciate that this is not “App Layering” as such, but it relies on underlying Windows technology, and we could not find any mention of this as an issue in any support documentation. If this is the case, it seems to be a weak point in the product that it uses the first user’s access token to connect the .vhd rather than that of the actual logged-in user, which might prevent this from happening.

 

If anyone has any other ideas, or if anything above is not correct, we are happy to take suggestions!

Thanks

Phil

Link to comment
