In a previous article, Nutanix Disruption as a Service Serves Up VDI Assurance World First, I wrote about the Nutanix VDI Assurance program that allows customers to pay for VDI Infrastructure on a per-desktop basis (perpetual or term based), based on certain pre-defined user profiles, and guarantees the performance and service levels (takes away the risk). I included in that article an architecture diagram that would allow up to 10,000 Power User desktops, or 20,000 Task Workers. But like a lot of people I always like to do more, and I got some questions about how I’d scale up to even larger numbers. This article will answer those questions. Below I present an architecture that is initially sized for 20K Power Users or 40K Task Workers, but can scale to 200K+ Power Users, just by adding in more of the standard building block components. I’ve also included a diagram of how this might look logically in a multi-site scenario.
The diagrams below assume the standard Nutanix definition of a Power User or Task Worker. The platform is based on the Nutanix VDI Assurance Model NX-3060 node type (256GB RAM). Each rack is a modular unit or building block and you can just keep adding racks. This is enabled by the underlying network architecture being a leaf-spine design, and by the web-scale architecture and linear scalability of the Nutanix Virtual Computing Platform. Information on leaf-spine network architecture is available from Cisco, Arista and others.
The Nutanix Virtual Computing Platform delivers Power User desktops at <6W per desktop, and Task Worker desktops at <3W per desktop, at any scale. This is much more power efficient than competing solutions. The density of 2.5K Power Users or 5K Task Workers per rack includes resources for N+1 resiliency per VDI cluster. When looking at the design, remember that this includes all server compute, storage, and networking components for the entire solution. There are no separate racks for storage or network equipment (except the uplinks to the WAN, which are not shown).
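To make the per-rack figures concrete, here is a rough back-of-the-envelope sketch in Python. It uses only the densities and power ceilings quoted above; the function names are my own, and treating "<6W" and "<3W" as upper bounds (rather than measured values) is an assumption.

```python
import math

# Figures quoted in the article.
POWER_USERS_PER_RACK = 2_500
TASK_WORKERS_PER_RACK = 5_000
MAX_WATTS_PER_POWER_USER = 6.0    # "<6W" treated as an upper bound (assumption)
MAX_WATTS_PER_TASK_WORKER = 3.0   # "<3W" treated as an upper bound (assumption)


def racks_needed(desktops: int, desktops_per_rack: int) -> int:
    """Whole racks required to host the given number of desktops."""
    return math.ceil(desktops / desktops_per_rack)


def power_ceiling_kw(desktops: int, max_watts_per_desktop: float) -> float:
    """Upper bound on total power draw, in kilowatts."""
    return desktops * max_watts_per_desktop / 1000.0


if __name__ == "__main__":
    # Initial sizing from the article: 20K Power Users or 40K Task Workers.
    print(racks_needed(20_000, POWER_USERS_PER_RACK))            # 8
    print(racks_needed(40_000, TASK_WORKERS_PER_RACK))           # 8
    print(power_ceiling_kw(20_000, MAX_WATTS_PER_POWER_USER))    # 120.0
    print(power_ceiling_kw(POWER_USERS_PER_RACK, MAX_WATTS_PER_POWER_USER))  # 15.0
```

On these numbers either profile works out to the same ceiling of 15 kW per fully populated rack, since a rack holds twice as many Task Workers at half the per-desktop power.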
What makes the VDI Assurance model from Nutanix so simple is that you don’t need to worry about this detail. Nutanix takes care of it for you. I just drew this pretty picture to get you interested. You simply have to know the number and profile of the users you need to host and Nutanix will give you the right infrastructure to run them with guaranteed service levels. If it’s not performing then Nutanix will fix it, which may mean deploying more hardware, at no additional cost. You pay per desktop, in packs of desktops, on a perpetual basis or for a term (1/3/5 yrs). That’s it. Uncompromisingly simple.
Note these diagrams are of my creation (based on previous example diagrams and good work by Steven Poitras – http://stevenpoitras.com/, author of the Nutanix Bible) and your actual deployment under the Nutanix VDI Assurance model may differ from this. This is for informational purposes only, but I did spend a lot of time calculating the numbers to make sure the design would work. This is still only one component of a complete design, and an oversimplification. These designs could be deployed with your favourite VDI solution, so you can choose between either VMware Horizon View or Citrix XenDesktop.
VDI for 10K Power Users, 20K Task Workers
This is the design diagram from my previous article.
VDI for 20K to 200K+ Power Users, 40K to 400K+ Task Workers
If you click on the image it will be expanded and easier to read. This can easily be expanded to 72 racks by adding line cards to the spine switches using the MLAG approach. Using ECMP (for above 72 racks) you could add an additional 36 racks, for a total of 108 racks without any significant modifications to the design or its building blocks. You could keep going, but that would require some modifications to the design. Although I have not tested this design in the real world, I have calculated all the various components of the solution based on the Nutanix VDI Assurance model for VDI user profiles and nodes. With the Nutanix VDI Assurance model, though, you really just have to know how many desktops you need, and then Nutanix will do the rest.
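As a quick sanity check on the rack limits, the sketch below multiplies the 72-rack (MLAG) and 108-rack (ECMP) figures by the per-rack densities quoted earlier in the article. The helper names are hypothetical, and it assumes every rack is fully populated at the stated density.

```python
import math

# Figures quoted in the article.
POWER_USERS_PER_RACK = 2_500
TASK_WORKERS_PER_RACK = 5_000
RACK_LIMITS = {"MLAG": 72, "ECMP": 72 + 36}  # ordered smallest limit first


def capacity_at(racks: int) -> dict:
    """Desktops supported by a number of fully populated racks."""
    return {
        "power_users": racks * POWER_USERS_PER_RACK,
        "task_workers": racks * TASK_WORKERS_PER_RACK,
    }


def topology_for(racks: int):
    """Smallest spine approach whose rack limit covers the request, else None."""
    for name, limit in RACK_LIMITS.items():
        if racks <= limit:
            return name
    return None  # beyond 108 racks the article says the design needs changes


if __name__ == "__main__":
    for name, limit in RACK_LIMITS.items():
        print(name, limit, capacity_at(limit))
    racks_for_200k = math.ceil(200_000 / POWER_USERS_PER_RACK)
    print(racks_for_200k, topology_for(racks_for_200k))
```

By this arithmetic 72 racks host 180K Power Users and 108 racks host 270K, so a 200K Power User deployment (80 racks) would land in the ECMP range.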
This is by no means the only way to achieve this result. This is just one of many possible architecture options. But I think this shows the power of the Nutanix Virtual Computing Platform to deliver on the VDI use case while consuming an efficient power footprint and a very efficient datacenter footprint. There is more to a VDI design than a single diagram, but this should get the creative juices and imagination flowing.
Taking VDI to Multi-Site for DR and Even More Scale
So how do you provide DR and multi-site resilience and scale to a VDI design based on the Nutanix Virtual Computing Platform? Good question. Because VDI is a business critical application, especially when deployed at scale, you need to make sure you have headroom to handle failure and disaster scenarios. Here is a logical diagram of how this might look. You can click on the image to make it bigger.
Because the Nutanix Virtual Computing Platform has data replication built in, you can easily protect your master images and replicate them to one or more sites. You can quickly clone images using the Nutanix VAAI integration, and you can quickly deploy, refresh and power on desktops. Data deduplication, shadow clones, and the unique data locality features of the Nutanix platform make sure your desktops always receive optimal performance. With the Nutanix solution you get web-scale, cloud economics, high performance, simplicity and choice, at your place, and at your pace.
The above solution delivers power consumption of < 5W per Power User desktop and < 2.5W per Task Worker desktop. Take a look at the Nutanix VDI Assurance program, go through the Nutanix Product Info and Tech Papers, and check out the Case Studies. As always your feedback and comments are appreciated. Let me know your thoughts on scalable, large scale VDI design. If you want to see how the Nutanix solution compares to other reference architectures check out Battle RA Royale: More VDI For Less Moolah.
This post appeared on the Long White Virtual Clouds blog at longwhiteclouds.com, by Michael Webster. Copyright © 2014 – IT Solutions 2000 Ltd and Michael Webster. All rights reserved. Not to be reproduced for commercial purposes without written permission.
I'm afraid I don't really see the value of the Assurance program. I do a lot of work for customers on VDI. The first and most important aspect of any VDI project is to fully understand exactly what the physical desktops are doing while running in production. You need to understand user/system resource consumption and constraints. You need to understand what the use cases are.
As you could imagine, being able to collect/analyze this information across hundreds/thousands of users is very difficult. It's no different than the days of server virtualization where you would need to conduct an assessment using something like Capacity Planner to capture the important metrics so you could intelligently understand which servers to virtualize, and what the ESX(i) footprint would need to look like.
Asking a customer to just tell you what Nutanix bucket their users fall into is asking for the project to be undersized or oversized. A desktop assessment has to be performed to properly vet out the use cases and be able to rightsize the infrastructure. The assessment data will be able to tell you what the use cases actually look like.
For example, I've done many assessments where the customer use cases were heavily skewed on CPU and IOPS, but not RAM. Others have been skewed on CPU and RAM but not IOPS. Others on RAM and IOPS but not CPU.
There really isn't such a thing as real world VDI user profiles. Each customer has unique infrastructure, applications and use cases that drive their own unique user profiles. Establishing a set of criteria for a Task Worker (CPU, RAM, IOPS, Storage) just doesn't make sense IMO, and leads to customers trying to fit a square into a circle.
As a customer I'd rather define what my users need based on actual data and then procure infrastructure based on that. If my users require 2 x vCPU, 4GB RAM, 20 IOPS I procure for that. They might require 1 x vCPU, 4GB RAM, 50 IOPS, so I'll look to procure for that.
I just think it's too easy to ask people to guess which pre-defined bucket their users fall into and build a VDI infrastructure based on that. If I do my due diligence and conduct an assessment at the start, I have the data, so I don't have to guess; I know what I actually require.
Valid points Forsby, but I don't think Nutanix is telling anyone to guess on their user metrics. I look at it in this way. After I do my user studies and I understand my requirements I can then use those to order from the Nutanix VDI program.
This gives me the assurance that what I order will meet my requirements. This should help people with the infrastructure layer. And they plan to stand behind what they sell. You're not going to get any flash vendor to stand behind something like this. They will look to prove that the storage is doing its job and pass blame to something else.
Hi jordan57. If you do an assessment you already know your requirements. At that point you should be able to go to Nutanix, or any other vendor(s) and procure what your actual requirements are. The assessment puts you in a place where you're not worried about your platform falling over.
Telling a customer that you just have to know the profile of the users you want VDI for, is being far too simplistic IMO. There's not really such a thing as real-world VDI profiles. You see them all over the place used for high level simplistic sizing exercises, but they aren't realistic. I've used them before for past projects and I can safely tell you that way doesn't work.
So, now that you've done your assessment and you feel you want to leverage this assurance program, what do you do when your use case resource requirements don't fit any of the Nutanix Assurance use cases? All I'm trying to say is that once you know you need a car with 350 horse power, 4 wheel drive, leather seats, A/C and GPS, you go and order exactly that – not something that doesn't meet your requirements.
Hi Forbsy, you're absolutely right, especially when it comes to traditional infrastructure. There is no one size fits all and Nutanix isn't restricting your choices. You can still purchase desktops based on the old model, but then you're on the hook to guarantee it works. If you make a miscalculation with your assessment or architecture, then that risk is on you. Nutanix is providing flexibility in business model and reduction in risk.
There is nothing stopping you purchasing a bunch of VDI Assurance packs and also buying some Nutanix blocks based on specific profile requirements that don't fit the pre-defined model for a particular use case. But the legacy architectures also don't scale in the same way and place many more restrictions, which is why so much importance has had to be placed on the assessment. But if you look at the averages, in a lot of cases the profiles across a system even out, provided you can meet the peak. Due to the unique features of the Nutanix platform (including linear scalability, data locality, dedupe and shadow clones) we're able to take a lot of the risk out and, with a margin of safety, provide a solution that can meet the vast majority of VDI requirements, based on a very convenient procurement model, while still giving customers the choice of other options. If you choose the simple option you get service levels guaranteed. You can purchase on demand, and on a per-desktop basis. Even with the normal Nutanix system procurement approach, the hyperconverged and efficient platform means you can reduce your risk of a miscalculation in performance and profile, quickly and easily. I've done large VDI projects on traditional infrastructures and been through the assessments. Even after the assessments the architecture isn't that easy to design and deploy in the traditional legacy architecture way. This new program is one way of solving that problem and making things simpler for customers. Simpler is better, as everything is too complicated and VDI is much more complicated than it needs to be.
I agree that different environments also face different resource constraints, but in almost all environments I've experienced, the storage has always been most problematic and hardest to change. The Nutanix platform addresses all of the resource areas in a single solution: storage performance, storage capacity, CPU and RAM. That's why it fits the VDI and many other use cases well. If you have a resource constraint in one area you just add nodes and that balances out the utilization and constraints. With a distributed system your peaks are also distributed, so you are better able to handle unexpected variances in load. That is much harder to handle in a traditional, non-hyperconverged architecture, at least at the storage level.
All the cloud providers, even Google and Amazon, as well as others are providing Desktop as a Service for certain profiles. Why not have the cloud economics, cloud type procurement, cloud simplicity, but in your datacenter, with service guarantees, and under your control? Thanks for the feedback and also the robust discussion.
[…] disruptive to the virtualization environment. Today I want to leave you with an incredible post: the physical and logical design of a Nutanix-based infrastructure that can support one of these two […]
How many nodes/hosts are in your VDI clusters shown in the first pic (assuming these are Nutanix clusters), and how many node/hosts failures can each cluster tolerate?
Each VDI Cluster in the first image has 24 Hosts. This is with N+1 resilience, so it will tolerate at least one host failure. However the Nutanix Cluster is 48 nodes, i.e. covering the entire rack, and has N+2 resilience, so can tolerate two node failures. One Nutanix Cluster can run multiple VDI Clusters and in this case I am depicting one Nutanix cluster of 48 hosts running two VDI Clusters of 24 hosts each. This increases resilience and also performance. I’ve also added a link back to my previous article there so you can get additional details. If you look at the diagram you will see I have noted that vCenter and the Nutanix Cluster are per rack.
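For anyone who wants to see what N+1 means per host, here is a small illustrative calculation. It assumes the 2.5K Power Users per rack density from the article applies to this 48-node rack and that desktops are spread evenly across hosts; both are assumptions on my part, and the function name is made up.

```python
# Layout described in the reply above.
NODES_PER_RACK = 48
VDI_CLUSTERS_PER_RACK = 2
HOSTS_PER_VDI_CLUSTER = 24

# Density quoted in the article; applying it to this rack is an assumption.
POWER_USERS_PER_RACK = 2_500


def desktops_per_host(desktops_per_rack: int, failed_hosts_per_cluster: int = 0) -> float:
    """Average desktops per surviving host in one VDI cluster (even spread assumed)."""
    per_cluster = desktops_per_rack / VDI_CLUSTERS_PER_RACK
    surviving_hosts = HOSTS_PER_VDI_CLUSTER - failed_hosts_per_cluster
    return per_cluster / surviving_hosts


if __name__ == "__main__":
    # The two VDI clusters together cover the whole rack.
    assert VDI_CLUSTERS_PER_RACK * HOSTS_PER_VDI_CLUSTER == NODES_PER_RACK
    print(round(desktops_per_host(POWER_USERS_PER_RACK), 1))      # 52.1
    print(round(desktops_per_host(POWER_USERS_PER_RACK, 1), 1))   # 54.3
```

So under these assumptions each host carries roughly 52 Power User desktops normally, rising to about 54 on the remaining 23 hosts if one host in a VDI cluster fails.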
+1 on what Michael is describing. This aligns with the VCDX methodology for infrastructure design. There is also one thing to consider when looking at what cloud providers support. There is consumer grade and there is enterprise grade. There is front-end infrastructure and back-end infrastructure. QA environments have common use cases and some corner use cases, which result in design guidelines that help with design choices and the design patterns used. The above calls out an approach to provide flexibility so that you can support greenfield design or upgrade design solutions.
[…] will be from anywhere on any device at any time, which suits private cloud environments. Typically VDI is built on specific hardware, but increasingly integration opportunities between VDI and Cloud vendors are […]