High Availability is a key consideration for any VMware vSphere design. VMware HA is an easy and effective feature that you should always enable to improve VM availability. vSphere 5 introduces a considerably enhanced mechanism for high availability that removes the limitations of previous versions, making it much easier to build clusters that contain a far larger number of hosts. With the enhancements to VMware HA in vSphere 5 there are some important considerations to take into account, especially in blade environments, to achieve adequate availability in different failure scenarios. With much larger clusters, and with clusters that will contain business-critical workloads, it’s important that you consider HA not just in terms of N+1 hosts, but also the cases where N+1 does not equal one host.
This will not be a deep dive into VMware HA and Admission Control, but it will contain a brief overview of some of the design considerations that are necessary in all environments and some differences between vSphere 4.x and 5.x. The best reference for VMware HA and DRS is Duncan Epping and Frank Denneman’s book VMware vSphere 5 Clustering Technical Deep Dive. I highly recommend it and every VMware admin should own a copy.
Firstly, some brief background on VMware HA in vSphere 4.x and vSphere 5 blade server environments.
VMware HA in a vSphere 4.x Blade Environment
In vSphere 4.x VMware HA used a concept of primaries and secondaries. There were up to 5 primary nodes in a cluster, which were responsible for ensuring VMs were restarted in the case of host failure, and there was no mechanism to control where the primaries would live within the cluster. In a blade server environment this meant you should deploy no more than 4 hosts per chassis to guarantee that, in the case of a chassis failure, at least one primary would survive to coordinate VM restarts. This somewhat limited flexibility and cluster sizes. Most organizations would split clusters across multiple blade chassis with fewer than 4 hosts per chassis; in most cases only 1 or 2 hosts per cluster were deployed per chassis to reduce the fault domain, and most clusters were deployed with only up to 8 hosts to limit the number of chassis required.
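The arithmetic behind the 4-hosts-per-chassis guideline can be sketched as a simple worst-case model (this is an illustration of the reasoning, not VMware code):

```python
# Worst-case model of vSphere 4.x HA primaries in a blade cluster.
# With 5 primary nodes placed on arbitrary hosts, a single chassis
# failure can take out at most as many primaries as the chassis has hosts.

MAX_PRIMARIES = 5  # fixed number of primary nodes in vSphere 4.x HA (AAM)

def primaries_guaranteed_to_survive(hosts_per_chassis: int) -> int:
    """Worst case assumes every host lost in the chassis was a primary."""
    return max(0, MAX_PRIMARIES - hosts_per_chassis)

# With 4 hosts per chassis, at least one primary always survives:
assert primaries_guaranteed_to_survive(4) == 1
# With 5 or more hosts per chassis, a single chassis failure could
# take out every primary, leaving no node to coordinate restarts:
assert primaries_guaranteed_to_survive(5) == 0
```

Because primary placement could not be controlled, only this worst case could be designed against, which is why the 4-host limit was treated as a hard rule.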
VMware HA in a vSphere 5 Blade Environment
VMware HA in vSphere 5 replaces the previous AAM (Automated Availability Management) agent with FDM (Fault Domain Manager). The new FDM agent does away with the multiple-primaries concept entirely and replaces it with a master/slave arrangement: if the master fails, a new master is elected. This means that, purely from a VMware HA perspective, it is much less of a problem to have more than 4 hosts per chassis in the same cluster. In the case of a chassis failure a new master will be elected on a host in the cluster on another blade chassis, and any failed VMs will be restarted. This allows much larger cluster sizes, potentially with fewer blade chassis, and provides much more design flexibility. You may be much more likely to design environments with 4 or more hosts per chassis in the same cluster. Here, though, I would like to introduce one of my design maxims: just because you can doesn’t mean you immediately should.
When N+1 is not Equal to 1 Host
I have recently completed a couple of engagements designing environments for very large financial institutions where the clusters will start off with between 10 and 16 hosts each from day one. Both environments will use blade servers, and both have fairly strict high availability requirements. These designs highlighted a situation where N+1 availability in a cluster may not equal just 1 host. In both cases the customers required that the cluster continue to operate without major performance impact not just after a single host failure, but also after an entire chassis failure. So in this case N+1 equals N+1 chassis, not just N+1 hosts. HA Admission Control then needs to be configured to ensure sufficient resources are available to restart all necessary VMs in the case of a chassis failure. The diagram below provides an overview of an example layout.
In the above diagram you can see the Management Cluster consists of 4 hosts, each deployed into a separate chassis. The Resource Cluster contains 16 hosts, of which 4 could fail without a major impact on availability or performance. In this design I chose to set HA Admission Control to a percentage of cluster resources reserved for failover equal to two hosts, and reserved another two hosts’ worth of capacity through capacity planning and performance management processes. This allows sufficient maintenance and growth capacity while at the same time guaranteeing availability. Depending on the number of hosts you can afford to lose per cluster, you may still want to deploy fewer hosts per chassis and use more chassis. If you have a large environment that will contain multiple clusters and a large number of chassis, it may be possible to have a large cluster and still only have 2 hosts per chassis.
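The percentage-based Admission Control sizing above can be illustrated with some simple arithmetic (the figures match the 16-host example layout; a real design must also account for unequal host sizes, VM reservations and memory overheads):

```python
# Sketch: sizing percentage-based HA Admission Control to tolerate a
# chassis-level failure. The cluster and chassis sizes below are the
# illustrative numbers from the example design, not a recommendation.

def reserved_percentage(hosts_reserved: int, total_hosts: int) -> float:
    """Percentage of cluster resources to reserve for failover,
    assuming hosts of roughly equal capacity."""
    return 100.0 * hosts_reserved / total_hosts

total_hosts = 16       # resource cluster size
hosts_per_chassis = 4  # 4 chassis x 4 hosts

# Reserving two hosts' worth via Admission Control:
print(reserved_percentage(2, total_hosts))                 # 12.5 (%)

# Surviving a full chassis failure needs one chassis' worth in total
# (here covered by Admission Control plus capacity management):
print(reserved_percentage(hosts_per_chassis, total_hosts)) # 25.0 (%)
```

The point of splitting the reservation between Admission Control (two hosts) and capacity management (another two hosts) is that Admission Control alone would otherwise block powering on VMs long before the growth headroom was actually consumed.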
The above should not just be a consideration for very large environments with many hosts, chassis and clusters; it should also be a consideration if you have a more moderately sized environment and want to combine some existing clusters into fewer, larger clusters when you upgrade to vSphere 5.
Another Example Where N+1 is not Equal to N+1 Hosts
In the same project the customers also wanted to ensure that if a single rack became unavailable, through a power fault or maintenance, the cluster would survive with limited impact on performance. To achieve this the blade chassis were split across different racks within the datacentre. This ensures any single rack issue has the same failure domain as a single blade chassis failure. In this case N+1 equals N+1 racks, not just N+1 hosts or blade chassis.
DRS Affinity Rule Considerations with Blade Environments
When you have multiple hosts per chassis in the same cluster in a blade environment, without additional configuration it is possible that VMs covered by a DRS Anti-Affinity Rule will be separated onto hosts within a single chassis. This is a potential problem: there is no real guarantee of keeping the VMs separated, and availability will be impacted in the case of a blade chassis failure. The same problem occurs with Fault Tolerant VMs, whose primary and secondary may run on two hosts within the same blade chassis; it is not possible to assign the primary and secondary VM of a fault tolerant pair to different DRS groups. VMware DRS and HA also do not currently contain any chassis, rack, or site awareness or tagging capability, which would be very useful for addressing advanced availability considerations.
To guarantee separation across chassis you should specify one or more DRS Host Groups, VM Groups and VM-Group-to-Host-Group rules in addition to the Anti-Affinity rules. The DRS Host Groups should be defined horizontally across the hosts in the cluster, spanning the blade chassis. This guarantees that VMs that need to be kept separate will not reside on two different hosts in the same chassis. The diagram below provides an overview of this configuration.
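The "horizontal" host group layout can be sketched as a simple slicing exercise (host and chassis names here are hypothetical; in practice you would create the groups in vCenter or via scripting):

```python
# Sketch: defining DRS Host Groups "horizontally" across blade chassis,
# so that each group contains at most one host from any given chassis.
# Host and chassis names are illustrative only.

def horizontal_host_groups(hosts_by_chassis: dict) -> list:
    """Slice hosts across chassis: group N takes the Nth host from
    every chassis that has one."""
    width = max(len(hosts) for hosts in hosts_by_chassis.values())
    groups = []
    for i in range(width):
        group = [hosts[i] for hosts in hosts_by_chassis.values()
                 if i < len(hosts)]
        groups.append(group)
    return groups

layout = {
    "chassis1": ["esx01", "esx02"],
    "chassis2": ["esx03", "esx04"],
    "chassis3": ["esx05", "esx06"],
}
for group in horizontal_host_groups(layout):
    print(group)
# -> ['esx01', 'esx03', 'esx05']
# -> ['esx02', 'esx04', 'esx06']
```

Because each resulting group spans the chassis, two VMs pinned to different host groups can never end up on two hosts in the same chassis, which is exactly the guarantee the anti-affinity rule alone cannot give you.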
Configuring one or more DRS Host Groups means you can assign the relevant VMs to the correct grouping, but it adds slightly to management overhead. The assignment could be automated at provisioning time with workflows in vCenter Orchestrator or with PowerShell scripts, or executed manually by an administrator. This should be the exception and not the rule, as it limits the placement options for VMware DRS and may reduce cluster efficiency and flexibility. It also will not work well with vCloud Director, as vCloud has no knowledge of the host groups; in theory this could be mitigated by using blocking tasks and, again, integration with vCenter Orchestrator.
What about Rack Server Environments?
Similar considerations apply to rack server environments, in terms of rack failure and the distribution of hosts and clusters across multiple racks. Outages or maintenance of racks should be taken into account when you design your clusters. You may also want to define DRS Host Groups that span racks, so you can ensure VMs are separated across physically different racks, not just hosts within a rack. It all comes down to your business and availability requirements.
Don’t consider cluster failure scenarios solely in terms of N+1 hosts. Ensure you consider chassis and rack failure and maintenance scenarios too. This is particularly important in large enterprise and cloud service provider environments. Always consider the three maxims of cloud computing when designing your environment: hardware fails, software has bugs, and people make mistakes. Try to limit the failure domain for each failure scenario. Configuration problems and problems during upgrades are more likely to impact availability than random hardware failure. Just because your vendor says the upgrade is non-disruptive doesn’t mean it will be; trust but verify, and always limit risk where possible!
Don’t design an elaborate and complex HA cluster just because you can; always consider business requirements and cost objectives, and weigh those against availability and risk. There is no one-size-fits-all approach.
Lastly, buy Duncan and Frank’s book, VMware vSphere 5 Clustering Technical Deep Dive. It is very reasonably priced and you won’t regret it. Before you ask: no, I don’t get any kickbacks or commission from recommending it. It is simply a reference I don’t think any admin should be without.
This post first appeared on the Long White Virtual Clouds blog at longwhiteclouds.com, by Michael Webster +. Copyright © 2012 – IT Solutions 2000 Ltd and Michael Webster +. All rights reserved. Not to be reproduced for commercial purposes without written permission.