I got an interesting item in my inbox from a friend who was speaking with VMware about their VDI solution. He asked me if what VMware was telling him was true. He was especially curious because he knew I wrote the Citrix XenDesktop Enterprise Design reference architecture that VMware was referencing to talk about how much better View was. VMware's approach is laughable. They are taking a detailed consulting design document and trying to compare it to the VMware View reference architecture, which, if you read it as I have (wasted 2 hours of my life), you will quickly see is high-level, full of marketing spin, and provides no insight. I, on the other hand, was trying to provide all of you in the community with insight into how to design a large and complex customer environment with XenDesktop. Anyway, I told him the angle they were using and he thought it was ridiculous. I was going to leave it at that, but I've been seeing and hearing more about it from others, so I thought I would provide all of you with the same information. Let's break it down:
Scalability:
- Misconception: VMware says that XenDesktop has poor hypervisor scalability. They say that on a 16-core server XenDesktop can only support 40 users (about 3 users per core).
- Truth: The XenDesktop reference architecture for the hosted virtual desktops uses 8-core servers, not 16. In the design phase, we estimated 40-50 VMs per server, which works out to roughly 5-6 virtual desktops per core. We were a little conservative, as we were not sure how the unique applications would impact the system. But you can look at the Project Virtual Reality Check scalability white paper to get a good comparison of XenServer and ESX. Although the design VMware references was for XenServer, the same estimates would have been used if the hypervisor was running ESX.
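To make the per-core comparison concrete, here is a minimal Python sketch of the arithmetic. The function name `desktops_per_core` is purely illustrative; the inputs are the figures quoted above (40 users on 16 cores in VMware's version, 40-50 VMs on 8 cores in the actual design).

```python
def desktops_per_core(vms_per_server: int, cores: int) -> float:
    """Average number of virtual desktops hosted per physical core."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return vms_per_server / cores


# VMware's framing: 40 users on a 16-core server.
vmware_claim = desktops_per_core(40, 16)

# The actual design: 40-50 VMs on an 8-core server.
design_low = desktops_per_core(40, 8)
design_high = desktops_per_core(50, 8)

print(vmware_claim)  # 2.5
print(design_low)    # 5.0
print(design_high)   # 6.25
```

Halving the core count doubles the density for the same VM count, which is the whole difference between the two framings.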
Storage:
- Misconception: VMware likes to say that XenDesktop is a storage pig in that we need a lot of storage associated with each virtual desktop.
- Truth: This particular design had a requirement to keep a few system items persistent across workstation reboots, so we recommended the creation of a local, persistent disk of 3-5 GB to store items like event logs, performance metrics, antivirus definitions, etc. This is not NAS/SAN storage; it is the storage on the physical XenServer. Think about it. You buy an 8-core server, install XenServer, which is small, and the rest of the local storage is wasted. We utilize that for the persistent store of the virtual desktops. This means we cannot do XenMotion on the virtual desktops, but most customers I've spoken to do not have this requirement. After looking at VMware's reference architecture, I don't see any level of detail as to the amount of storage they require. I wonder why not.
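As a rough illustration of how small that local footprint is, the following Python sketch multiplies the 3-5 GB persistent disk by the 40-50 VMs per server estimated in the Scalability section. The function name is hypothetical, and the result is an upper bound on allocation, not measured usage.

```python
def persistent_disk_range_gb(vms_min: int, vms_max: int,
                             gb_min: int, gb_max: int) -> tuple[int, int]:
    """Smallest and largest total local disk allocated per host, in GB."""
    return vms_min * gb_min, vms_max * gb_max


# 40-50 virtual desktops per host, 3-5 GB persistent disk each.
low_gb, high_gb = persistent_disk_range_gb(40, 50, 3, 5)

print(low_gb, high_gb)  # 120 250
```

Even at the top of the range, the allocation fits on the local disks that ship with a typical server.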
Workloads:
- Misconception: VMware states that they can get more users on a hypervisor than we can.
- Truth: This is all about scalability tests, which I'm not a fan of. I can easily find you 5 tests that show XenServer is better and another 5 that show ESX is. The VMware reference architecture had users connected for 14 straight hours, which seems like a long workday to me. I have a question for VMware: What company did you create this architecture for where users would work for 14 hours? Please tell me, as I do not want to work there. As we all know, the heaviest system hit typically comes during startup and logon. So by expanding the session time from a few hours to 14, the overall average utilization rates can be significantly lowered, thus providing an inaccurate estimate of the hardware required.
- Truth: The Citrix Reference Architecture made estimates based on the applications and expected real user workload, not simple apps and 14-hour workdays. VMware's reference architecture was based on the standard scalability samples shown below. If this was an actual user workload, I totally want to work for that company because that job looks so easy:
- Microsoft Word - Open/minimize/close, write random words/numbers, save modifications.
- Microsoft Excel - Open/minimize/close, write random numbers, insert/delete columns/rows, copy/paste formulas
- Etc
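The averaging effect of a long session can be sketched in a few lines of Python. Everything numeric here is an assumption chosen only for illustration (a 15-minute startup/logon burst at 90% utilization followed by a steady 20%); none of it comes from either reference architecture. The point is only that the same burst, averaged over a longer session, yields a lower average.

```python
def average_utilization(session_minutes: int, burst_minutes: int,
                        burst_pct: float, steady_pct: float) -> float:
    """Time-weighted average utilization over one session.

    The session starts with a startup/logon burst and then settles
    to a steady-state level for the remaining time.
    """
    if session_minutes < burst_minutes:
        raise ValueError("session must be at least as long as the burst")
    steady_minutes = session_minutes - burst_minutes
    total = burst_minutes * burst_pct + steady_minutes * steady_pct
    return total / session_minutes


# Assumed workload shape: 15-minute burst at 90%, then 20% steady state.
four_hour_avg = average_utilization(4 * 60, 15, 90.0, 20.0)
fourteen_hour_avg = average_utilization(14 * 60, 15, 90.0, 20.0)

print(four_hour_avg)      # 24.375
print(fourteen_hour_avg)  # 21.25
```

The peak the hardware has to survive is identical in both cases; only the reported average changes.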
RAM:
- Misconception: The amount of RAM that VMware recommends in their reference architecture is nuts. They say they can get 96 users on a server with 96GB RAM.
- Truth: If you subtract the hypervisor overhead you are looking at "USABLE" RAM of about 800MB per virtual desktop. I say usable because ESX has probably enabled memory ballooning. It is true that XenServer does not have memory ballooning, but I would recommend customers disable this feature for virtual desktops. On XenDesktop projects that use the ESX hypervisor, I also recommend disabling this feature. Users and desktops are more dynamic than server workloads, meaning the RAM consumption is going to fluctuate greatly. If RAM starts to decrease to the critical threshold, what happens to the hypervisor? It must free up memory by paging this to disk. Isn't this an intensive system process that consumes more resources at a time when resources are scarce?
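For reference, the overhead implied by those numbers can be backed out with a short Python sketch. It uses only the figures above (96 GB, 96 desktops, about 800 MB usable each); the function name is illustrative, and the result is the overhead implied by the estimate, not a measured ESX value.

```python
def implied_overhead_mb(host_ram_gb: int, desktops: int,
                        usable_mb_per_desktop: float) -> float:
    """Per-desktop overhead implied by a usable-RAM estimate, in MB."""
    if desktops <= 0:
        raise ValueError("desktops must be positive")
    even_share_mb = host_ram_gb * 1024 / desktops
    return even_share_mb - usable_mb_per_desktop


per_desktop_mb = implied_overhead_mb(96, 96, 800)
total_overhead_gb = per_desktop_mb * 96 / 1024

print(per_desktop_mb)     # 224.0
print(total_overhead_gb)  # 21.0
```

An even split gives each desktop 1024 MB, so an 800 MB usable figure corresponds to roughly 224 MB of overhead per desktop.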
End Points:
- Misconception: VMware talks about end points but focuses only on thin clients and end points that we can repurpose with a Linux OS or a locked-down Windows OS. What about the newer end points that organizations have already spent money on?
- Truth: With VMware View you still will connect to the VDI desktop and idle your local hardware. Seems like a lot of wasted desktop resources to me. XenDesktop, on the other hand, allows you to re-use those desktops as a local streamed virtual desktop. Don't be fooled, there is more to desktop virtualization than VDI.
Provision:
- Truth: Closer to the end, the reference architecture talks about the time to provision X number of linked clone desktops. I'm not sure if this is automated or if an admin has to do each desktop one-by-one. I'll give VMware the benefit of the doubt here and say it is automated, but taking 161 minutes (about 2 hours 40 minutes) to provision 500 virtual desktops seems long to me. I personally don't think this metric is important, even though XenDesktop is measured in seconds. If it is automated, you do all of this in the build-out phase and not in production. So the time it takes is irrelevant to me. Why did they choose to include it? No idea.
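For scale, a quick Python sketch converts the 161 minutes for 500 desktops into a per-desktop figure. It assumes the desktops are provisioned one after another at a constant rate, which the reference architecture may or may not do; the function name is illustrative.

```python
def seconds_per_desktop(total_minutes: float, desktops: int) -> float:
    """Average provisioning time per desktop, in seconds."""
    if desktops <= 0:
        raise ValueError("desktops must be positive")
    return total_minutes * 60 / desktops


per_desktop = seconds_per_desktop(161, 500)
total_hours = 161 / 60

print(round(per_desktop, 2))  # 19.32
print(round(total_hours, 2))  # 2.68
```

That is a little under 20 seconds per desktop on average, if the work is evenly spread.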
So my advice to anyone who is still reading this blog... Take everything you get with a level of skepticism. Do your own due diligence and look at the details to see if things were glossed over or if an in-depth analysis and design was completed. That recommendation even includes the materials I post. I try to be open and honest in my blogs, white papers, TechTalks and videos, but I am a little biased to Citrix because they pay my bills.
If you want to discuss more, or have further questions, then Ask the Architect
Daniel - Lead Architect - Worldwide Consulting Solutions
- Twitter: @djfeller
- For the latest desktop virtualization information visit the Ask the Architect - Next-Generation Desktop site
Comments (18)
Nov 06
Anonymous says:
"Take everything you get with a level of skepticism" That should include this blog.
You are damn wrong on the memory part. Transparent Page Sharing will have a HUGE benefit for ESX. Also, it is purely statistical: given, let's say, 100 desktops, few scenarios will have all of them using all the memory at the same time.
The truth is clear here: The lack of Memory Overcommit on XenServer kills scalability, and forces you to buy much more hardware.
Can you imagine more demanding desktops, let's say, with 2, 3, 4 GBs ?
Also, on provisioning tasks, View kicks XenDesktop ass greatly. Look at this for ex:
http://www.youtube.com/vmwareview#p/a/u/1/icD6_p_fl2g
Also, all the HDX, hi def, 3D, VoIP stuff on XenDesktop is for LAN only !
Nov 06
Sid Herron says:
...and I am particularly skeptical of anonymous comments...
Nov 09
Anonymous says:
If you were a little more productive with your comments (like citing actual facts and actual customers) rather than swearing, I bet more people would take you seriously.
Nov 09
Daniel Feller says:
Transparent page sharing will help save some memory, but I'm talking about overcommitting memory. This all depends on how you design your virtual desktops. If I know my desktops will consume around 800 MB of RAM, you will see a huge benefit in overcommit if you allocate 2 GB each, but that seems like a poor design to me. These types of features use technology (and costs associated) to fix poor architectural design. In a desktop virtualization solution, I recommend at least 3 different types of virtual machine specs based on the users:
1. Light
2. Normal
3. Power
This allows me to align the needs of the user with the resources allocated.
Also, if you observe desktop users for some time you will quickly see their RAM consumption increases throughout the day because they leave apps open, or apps do not release memory appropriately. So when you observe users to define virtual machine specs, make sure you observe the correct data points.
Thanks
Nov 09
Anonymous says:
Hey Daniel,
It is not poor design. I have never seen anyone who knows exactly that their users will use 973 MB of RAM in the morning and 1220 MB in the afternoon. I am not talking about 10:1 overcommitment, but, as Massimo mentions, 2:1 or 1.5:1 are more or less realistic approaches, and the more memory on a VM, the more opportunities there are. You can always run a POC to figure it out.
What happens when all users want all CPU resources at the same time? You will have a disaster on your performance. Even so, you assume a ratio between physical and virtual CPUs based on average usage. Why, with memory, do you need a 1:1 relation if we know not everyone will claim all the memory at the same time?
By setting an absolute value on memory consumption, you add some inflexibility to the environment. Imagine if you need to put a server in maintenance mode for ex. You will have to predict HW for that, to fully accommodate all VMs and their memory sizes on other hosts.
I bet Citrix is working on this (I saw a Citrix Market PDF sent to customers, stating that XenServer will have it "soon").
Regards,
Fernando
Nov 09
Daniel Feller says:
Great assessment. I love these types of debates.
With CPU, utilization fluctuates from second to second and flatlines when the user is idle, hence virtualization is awesome from this perspective because roughly 40-60% of your users will be idle at the same time. But memory is a different story.
We know, with a fair degree of certainty, what each user requires from an OS perspective. Then we pile on applications and consume more RAM, for which you can grab a baseline. Unfortunately, that consumption continues to grow throughout the day, even when the user is idle, because users keep their applications open. I keep Outlook, IM, TweetDeck, Firefox, Word and Visio open 100% of the time even if I'm not using them currently. And by the end of the day my RAM utilization is higher than in the morning because these apps, plus others I've actually closed, are not releasing their RAM as they should. So, if users stay connected for 8 straight hours, and their RAM usage increases consistently throughout the day, I have a hard time realizing the benefit of overcommit. You see a big benefit in the morning, but I would be concerned that by the end of the day the physical server is starved for memory.
As for XenServer, does it include this feature? No. Will it? No idea as I'm not in the product group. But like I said earlier, many of the designs we have done in consulting where XenDesktop is using ESX as a hypervisor, we recommend disabling overcommit as the risk was too great. This isn't to pit XenServer against ESX because the customer already made up their mind for ESX. We are just trying to create the best possible solution for them.
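To show the shape of that concern, here is a minimal Python sketch of per-desktop RAM that grows through an 8-hour day on an overcommitted host. The 800 MB starting point and 8-hour day echo figures from this thread; the host size (64 GB), desktop count (50) and growth rate (100 MB per hour) are assumptions picked purely for illustration, and the linear growth model is a simplification.

```python
from typing import Optional


def first_starved_hour(physical_mb: int, desktops: int, start_mb: int,
                       growth_mb_per_hour: int, hours: int) -> Optional[int]:
    """First whole hour at which total demand exceeds physical RAM.

    Returns None if demand stays within physical RAM for the whole day.
    """
    for hour in range(hours + 1):
        demand_mb = desktops * (start_mb + growth_mb_per_hour * hour)
        if demand_mb > physical_mb:
            return hour
    return None


# Assumed host: 64 GB for desktops, 50 desktops starting at 800 MB each.
growing = first_starved_hour(64 * 1024, 50, 800, 100, 8)
flat = first_starved_hour(64 * 1024, 50, 800, 0, 8)

print(growing)  # 6
print(flat)     # None
```

With flat usage the host is comfortable all day; with steady growth the same host runs out of physical RAM in the afternoon, which is when ballooning and paging would kick in.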
Nov 09
Anonymous says:
Very good points,
You are right, CPU is much more an instantaneous measure, and memory consumption indeed tends to rise during the day. But even so, it can vary dramatically. A user with a 4GB mailbox will demand lots of memory from Outlook, while a regular user might need much less. A heavy SAP user also demands more than the basic user. Looks like sizing the good old MetaFrame XP servers, huh? :-D
And it is not only the ballooning. It is the Transparent Page Sharing also. Can you imagine how many identical memory pages we will have on 50 XP instances, running the same set of applications?
Even if you have tons of memory, don't disable memory OM. It can be handy to give you some flexibility, providing more capacity if you quickly run out of resources, failover capacity etc etc.
Try it next time you face a situation like this; I guarantee you will not regret it.
Fernando
Nov 09
Anonymous says:
Daniel,
Another thing to consider: When you have multiple applications open, but inactive, the OS will have many inactive pages in memory, which are the primary candidates to be swapped. In a memory pressure scenario, the ballooning driver will put pressure on the OS to page memory to disk, and these inactive pages will be the first (after the really unused memory of course). The performance hit will be negligible, since the active working set is still in memory. So, Memory OM will still be able to claim some "used but inactive" memory from guests. But this will only happen when ESX starts running out of memory, and after claiming back the really unused memory.
Besides that, the OS swaps memory all the time ... look at task manager on your PC when you have everything open during all the day, and add the "Virtual Memory" column for ex. The OS starts to swap inactive memory on its own, without the ballooning driver, even when there's available memory.
Regards
Fernando
yesterday at 08:20 AM
Anonymous says:
Actually the "video" has missed some parts of what you need to do on the VMware side of things. But as always, VMware likes to hide technology from their customers!
Nov 06
Massimo RE FERRE' says:
I am skeptical of anonymous comments too.
Having said this, I have to admit there are two things that do not convince me in this analysis.
1)
>Users and desktops are more dynamic than server workloads, meaning the RAM consumption is going to
>fluctuate greatly
Isn't that the ultimate reason for which you want to use memory overcommitment? If RAM consumption was going to be fixed, that is when you wouldn't gain much from memory overcommitment. Obviously you need to assume that at any point in time the combined actual RAM usage of all VMs never exceeds the physical RAM installed.... but isn't that the same assumption you make when you create 40 virtual CPUs on a system that has 8 physical CPUs?
2)
>You buy an 8 core server, install XenServer, which is small, and the rest of the local storage is
>wasted.
Well VMware could always say that with embedded hypervisors you do not even have local storage. From an architectural perspective, having local persistent storage is very incompatible with where we are all heading to (i.e. stateless computing). I am not so much worried about live migrating those VMs... I am worried about what happens when that dual-socket server will go down (and it will .... sooner or later).
Massimo.
Massimo Re Ferre' (IBM)
Nov 09
Daniel Feller says:
#1 - Good point regarding the server. My concern on the desktop is that usage fluctuates so much; how do you know how much overcommit you can safely do? As the day progresses, most users consume more RAM as they open more apps and keep them open. Even if the user closes the apps, there are still parts of the app that remain resident in memory. Also, not all apps are created cleanly, meaning the longer you use the app, the more RAM it will take because it is doing more activities, not cleaning up its memory consumption, and might have memory leaks.
#2 - For this example it is local storage, but there is nothing preventing you from putting it on shared storage. And even though we say 3-5GB of storage per VM for persistent information, most of that space is unused. So with thin provisioning of the storage or data de-duplication, you will likely see a very small portion of allocated storage being used. In the design doc, we like to prepare the customer for the maximum usage so they aren't shocked at a later date.
Nov 09
Massimo RE FERRE' says:
Daniel,
#1 True. You probably can't multiply memory indefinitely but 2:1 is a good rule of thumb. Sure, there may be customers that will be able to do 3:1 and others that have to stick to 1:1, but we can't discount memory overcommitment as irrelevant. It's all about overcommitting resources (disks included, using techniques such as thin provisioning and deduplication). Memory is just one of the many (all) subsystems that are being virtualized/overcommitted. You will get there too with XenServer.
#2 Ok. However my point was not so much about the amount of storage being used. Even a single byte is important if you can't get to it. If that was just an example that's good. I would personally vote for "everything on shared storage" to get to that stateless nirvana my anonymous friend likes so much.
Thanks.
Massimo.
Nov 08
Anonymous says:
Yes we are all headed toward stateless computing only IBM will take about 5 years longer than anybody else to get there
snoooozzzzzzzzzzzzzzzzzzeeeeee
Nov 08
Anonymous says:
How old are you?
You sound like a typical 13-year-old fanboy. Instead of being constructive and an interesting debater, you are just childish and annoying.
Greetings,
John Pater Norfield
Nov 08
Massimo RE FERRE' says:
>Yes we are all headed toward stateless computing only IBM will take about 5 years longer than
>anybody else to get there
Did someone from IBM (mistakenly) steal your candy? I apologize on behalf of him/her.
Massimo.
Nov 09
Anonymous says:
The problem with the first Anon post is that even VMware support has you disable Memory overcommit if you are having issues, and it is actually disabled in best practices. By going with Xen you will actually have enough money to throw "@ the problem" vs spending it on a license. ESX is still available as a hypervisor with the XenDesktop product if you feel Memory OC is such a magic bullet.
As for provisioning, it may take some more time on the front end, but you can quickly get yourself into linked-clone hell with View, but then again you can just throw more storage at it.
LM
Nov 09
Anonymous says:
That's a misconception. VMware never recommended disabling memory overcommitment. This is on by default, and everybody uses it.
This is very proven and reliable, as long as you do not plan to overcommit 10:1 :-D
Regards,
Fernando
yesterday at 08:14 AM
Anonymous says:
Actually Fernando, you are wrong.
Or maybe it depends on which country you are located in.
For ROI/TCO measurements VMware always recommends these types of features, as they can show how much they can consolidate. However, practice and theory are two different stories; 95% of my customers have these features turned off, because when the going gets tough, the tough get going.