Over the past week I was discussing vSphere 5.5 best practices with regard to Nutanix with Josh Odgers (fellow VCDX and member of the Nutanix Solutions and Performance Engineering Team), who is putting together a comprehensive paper on vSphere 5.5 best practices for Nutanix. One of the topics that came up in our discussions was availability and capacity planning, such as the number of node failures to tolerate, and how to make capacity planning easy and ensure you can always meet your failure tolerance. Unlike many storage architectures, there is no fixed limit to the number of node failures a Nutanix environment can sustain before data protection may become an issue. It all comes down to your design and how much free space there is to re-protect your data. If, for example, you have a 64 node Nutanix cluster that is 80% full, you can potentially survive the loss of more than 10 nodes (from a storage perspective) before your ability to re-protect the data becomes an issue. But even at small scale, say 3 nodes, what is the best way to avoid trouble, plan for capacity and failure, and ensure your data is protected?
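To make the 64 node example concrete, here is a rough back-of-the-envelope sketch in Python. It is not a Nutanix tool or API. The function name is hypothetical, and it assumes identical nodes, a fill percentage that already includes replica copies, and failures that happen one at a time with re-protection completing in between. It looks at storage capacity only.

```python
def max_node_failures(nodes: int, used_pct: int) -> int:
    """Estimate how many nodes can be lost before the surviving nodes
    can no longer hold all of the stored data (storage view only).

    Assumptions (not from Nutanix documentation): all nodes are the same
    size, and used_pct is the cluster-wide fill level including replicas.
    """
    if nodes <= 0 or not 0 <= used_pct <= 100:
        raise ValueError("nodes must be positive and used_pct within 0..100")
    # Free space expressed in whole nodes' worth of capacity.
    # Integer arithmetic avoids floating point rounding surprises.
    return (nodes * (100 - used_pct)) // 100


if __name__ == "__main__":
    print(max_node_failures(64, 80))  # 12, in line with "more than 10 nodes"
    print(max_node_failures(3, 60))   # 1
```

The result is an upper bound from a capacity point of view. It says nothing about compute headroom or about how many simultaneous failures your replication setting can absorb.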
Any environment where rapid storage growth could result in an out of space condition requires monitoring and capacity planning. Every environment at some point will experience a failure. Hardware eventually fails, software eventually works. Hardware fails, software has bugs, people make mistakes. Choose whichever cliché you like: we as IT professionals need to plan for capacity and also for failure, no matter what architecture we choose for our environments. Fortunately we can make this process very simple in a Nutanix environment.
When planning your Nutanix environment you should plan for maintenance, availability and failure up front. If you want to be able to do hardware or hypervisor maintenance on a node non-disruptively, then you’ll have to have capacity for that built into your environment. If you want to be able to do maintenance and survive a node failure at the same time, then you need to build in that capacity as well. At a minimum your Nutanix environment, just like your hypervisor environment, should be planned for N+1 failure. If you have a total of 3 nodes and plan for N+1, then you have 2 nodes’ worth of capacity that can be used before impacting your recovery posture.
My general rule of thumb is similar to vSphere environments, where you may wish to have a node for failure for each 12 or 16 nodes in your cluster. So if you have a 32 node cluster you might want to logically have N+2, which gives you 30 nodes of capacity while surviving a two node failure. Some environments prefer to have a node for failure in every 8. So a 32 node cluster would be N+4, with 28 nodes’ worth of capacity before impacting recovery.
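The rule of thumb above can be sketched as a small calculation. The function names are hypothetical, and rounding up with a minimum of one spare node is my assumption to keep small clusters at N+1. Treat it as a planning aid, not a sizing tool.

```python
def spare_nodes(cluster_nodes: int, nodes_per_spare: int = 16) -> int:
    """Number of nodes to set aside for failure: one per `nodes_per_spare`
    nodes, rounded up, and never fewer than one (N+1 minimum)."""
    if cluster_nodes < 1 or nodes_per_spare < 1:
        raise ValueError("both arguments must be at least 1")
    # Ceiling division using integers only.
    return max(1, -(-cluster_nodes // nodes_per_spare))


def usable_nodes(cluster_nodes: int, nodes_per_spare: int = 16) -> int:
    """Nodes' worth of capacity left once the spares are set aside."""
    return cluster_nodes - spare_nodes(cluster_nodes, nodes_per_spare)


if __name__ == "__main__":
    print(spare_nodes(32, 16), usable_nodes(32, 16))  # 2 30  (N+2)
    print(spare_nodes(32, 8), usable_nodes(32, 8))    # 4 28  (N+4)
    print(spare_nodes(3), usable_nodes(3))            # 1 2   (N+1)
```

Because the division rounds up, a ratio of one in 12 gives N+3 for a 32 node cluster. Round down instead if you prefer the less conservative answer.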
This makes sense and it’s what we’ve all done for years. We’ve got HA built into vSphere, and we have admission control enabled to ensure we don’t overload our environments to the point where a failure could leave important VMs unprotected and unrecovered. But how do we apply the same admission control at the storage layer, so that we can always re-protect the data in the case of a catastrophic node failure?
The easiest and quickest way to do this in a Nutanix environment is by using a Container Reservation and a FreeSpace Container. A FreeSpace Container is an idea Josh and I came up with to make capacity planning simple, and to stop administrators from overprovisioning storage to the point that an out of space condition could result from a failure. It is essentially admission control for the Nutanix storage layer. Here’s how it works.
A FreeSpace Container is a container that you’ve provisioned, without mounting it on any hosts, that has a reservation equal to your failure tolerance. The FreeSpace reservation is based on raw capacity before any Replication Factor or Resiliency Factor is taken into account. So for example in a 10 node cluster you might set the FreeSpace Container reservation to 1 node, or 10% of your raw capacity (e.g. 4TB if using Nutanix 3000 series), just as vSphere Admission Control would be set to 10% of compute capacity. The free space shown to your hypervisor will be the total capacity minus the FreeSpace reservation, so you will easily see from the hypervisor side when you’re running out of capacity and need to add nodes. This also ensures you can survive a failure of the number of nodes you’ve designed for.
[Updated 04/12/2018] Note: The reservations in recent versions of AOS use logical space instead of physical space. When this article was originally written 4 years ago the reservations were based on physical space. 4TB Physical at RF2 = ~ 2TB Logical at RF2.
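As a quick illustration of the conversion in the note above, the sketch below divides a physical figure by the replication factor. The function name is hypothetical and the result is approximate, since it ignores any overheads that AOS may account for.

```python
def logical_reservation_tb(physical_tb: float, replication_factor: int = 2) -> float:
    """Approximate logical reservation for a given physical amount.

    Assumption: logical space is simply physical space divided by the
    replication factor (RF2 keeps two copies of every piece of data).
    """
    if physical_tb < 0 or replication_factor < 1:
        raise ValueError("physical_tb must be >= 0 and replication_factor >= 1")
    return physical_tb / replication_factor


if __name__ == "__main__":
    print(logical_reservation_tb(4))  # 2.0, i.e. 4TB physical is ~2TB logical at RF2
```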
Here is a diagram that shows a 16 node cluster with a failure tolerance of 2 nodes and the setting for the FreeSpace container.
In the case of the example above, using RF2 for data protection, you would have a usable storage capacity of 56TB visible to the hypervisor after the FreeSpace Container has been created (assuming 4TB of raw capacity per node, that is 64TB less the 8TB reservation for two nodes). If you were planning a 4 node cluster with a failure tolerance of 1 and 4TB in the FreeSpace Container, then you’d have usable storage visible to the hypervisor of 12TB. If you have a mixed cluster with different node models, then you should plan on tolerating a failure of at least the biggest node in the environment.
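The sizing in these examples can be sketched as follows. This is a planning calculation only, not a Nutanix API, and the function name is hypothetical. It works in raw (physical) terabytes, as in the original examples. For mixed clusters it reserves the largest nodes first, which is my reading of "at least the biggest node".

```python
def freespace_plan(node_raw_tb: list[float], failures_to_tolerate: int) -> tuple[float, float]:
    """Return (reservation_tb, visible_tb) for a FreeSpace Container.

    node_raw_tb lists the raw capacity of every node in the cluster.
    Assumption: the reservation covers the largest nodes, so the plan
    still holds if the biggest nodes are the ones that fail.
    """
    if not 0 <= failures_to_tolerate < len(node_raw_tb):
        raise ValueError("failures_to_tolerate must be between 0 and node count - 1")
    largest_first = sorted(node_raw_tb, reverse=True)
    reservation = sum(largest_first[:failures_to_tolerate])
    visible = sum(node_raw_tb) - reservation
    return reservation, visible


if __name__ == "__main__":
    print(freespace_plan([4.0] * 16, 2))           # (8.0, 56.0)
    print(freespace_plan([4.0] * 4, 1))            # (4.0, 12.0)
    print(freespace_plan([4.0, 4.0, 4.0, 8.0], 1)) # (8.0, 12.0)
```

Re-run the calculation whenever nodes are added, then adjust the reservation on the FreeSpace Container to match.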
It is not a good idea to run out of free space in any storage environment. If you run your storage environments too full, you risk not only data integrity but also performance. Most storage architectures recommend that you have some sort of free space buffer. Fortunately, by choosing a platform such as Nutanix that makes more efficient use of the underlying storage, easily reclaims free space when VMs or virtual disks are deleted, and has built-in compression and data de-duplication, you’re already better off. By using a FreeSpace Container that is not mounted to any host you can ensure you’ve always got a safety net, a buffer to keep you out of trouble, like admission control in vSphere HA. All you need to do when adding new nodes to the Nutanix cluster is modify the FreeSpace Container’s reservation to match your failure and capacity tolerance. Simple as that.
I hope you’ve found this useful. As always, your feedback and comments are appreciated.
This post appeared on the Long White Virtual Clouds blog at longwhiteclouds.com, by Michael Webster. Copyright © 2014 – IT Solutions 2000 Ltd and Michael Webster. All rights reserved. Not to be reproduced for commercial purposes without written permission.