If you have an application that needs very high service levels for availability (99.999%), 24/7/365, including maintenance and patching, then OS Clustering with shared storage is a proven solution. However, Fibre Channel, iSCSI, or direct-attached shared SCSI solutions can be very complex to set up and maintain. In some cases complex configuration is required not just in the hardware but also in the guest operating system. If you add virtualization to the mix, the complexity can increase further, with the need for physical mode raw device mappings. Ironically, increased complexity can decrease overall availability, especially by increasing the probability of human error. So how do we increase availability, decrease complexity, and provide a simple solution for OS Clustering with shared storage?
Nutanix AOS 5.17 with AHV has the answer. Nutanix AHV allows shared storage for Guest OS Clustering without any complex back-end storage support or configuration, unlike other hypervisors from leading vendors, which still require Fibre Channel storage if you wish to use virtual disks. From Nutanix AOS 5.17 onwards you can configure a shared Volume Group, attach it directly to two or more VMs, and set up Guest Clustering without any complex in-guest storage configuration at all. There is no complex storage back end to manage, as that is all provided automatically by Nutanix AOS, and none of the in-guest storage configuration that would ordinarily be required with iSCSI. This makes Guest Clustering incredibly simple to use, very easy to automate, and significantly less difficult to support and troubleshoot.
The process for creating a Guest OS Cluster has 3 main steps:
- Create 2 or more VMs with the OS of your choice
- Create a Volume Group with the number and size of virtual disks that you want for your clustered applications and attach it to the VMs
- Configure the clustering software inside of your chosen OS and deploy the applications
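Because the second step is just a handful of storage operations, it lends itself to scripting. The sketch below is a minimal illustration rather than the method used for this article: it only builds the command lines and does not run them. It assumes the `acli` verbs `vg.create`, `vg.disk_create` and `vg.attach_to_vm` with the `shared`, `container` and `create_size` parameters, so check them against the command reference for your AOS release. The function name, Volume Group name, container name and VM names are all hypothetical.

```python
import shlex


def build_vg_commands(vg_name, container, disk_sizes, vm_names):
    """Return acli command lines that create a shared Volume Group,
    add one vDisk per requested size, and attach the group to each VM.

    The acli verbs and parameters are assumptions to verify against
    your AOS version; nothing is executed here.
    """
    if len(vm_names) < 2:
        raise ValueError("a guest cluster needs at least 2 VMs")

    vg = shlex.quote(vg_name)
    # Create the Volume Group and mark it as shared between VMs.
    commands = [f"acli vg.create {vg} shared=true"]

    # Add one vDisk per requested size, e.g. "100G".
    for size in disk_sizes:
        commands.append(
            f"acli vg.disk_create {vg} "
            f"container={shlex.quote(container)} create_size={shlex.quote(size)}"
        )

    # Attach the same Volume Group directly to every cluster node VM.
    for vm in vm_names:
        commands.append(f"acli vg.attach_to_vm {vg} {shlex.quote(vm)}")

    return commands


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for line in build_vg_commands(
        "sql-vg", "default-container", ["100G", "200G"], ["sql1", "sql2"]
    ):
        print(line)
```

Generating the commands first makes it easy to review them before running anything against a real cluster.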
Here is an example of how the Volume Group might look in the storage section of Prism for AHV:
While it’s possible to have up to 256 vDisks or Volumes within a Volume Group, it is recommended to have 32 or fewer. If you need more Volumes you can create more Volume Groups.
When you attach a Volume Group to VMs, they will be listed in the Volume Group page within the Storage section of Nutanix Prism Element as follows:
If you wish to have a mixed virtual + physical cluster you can choose to enable external client access to the Volume Group. Any physical / external clients can then use an iSCSI initiator to connect to the cluster’s Target Data Services IP (DSIP) and mount the volumes.
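As a rough illustration of that sizing guidance, the sketch below splits a list of planned vDisks into Volume Groups of at most 32 disks each. The two limits are the figures quoted above; the function and constant names are hypothetical and are not part of any Nutanix tooling.

```python
RECOMMENDED_MAX_DISKS = 32   # recommended vDisks per Volume Group
HARD_MAX_DISKS = 256         # maximum vDisks per Volume Group


def plan_volume_groups(disks, per_group=RECOMMENDED_MAX_DISKS):
    """Split a list of planned vDisks into chunks, one chunk per Volume Group."""
    if not 1 <= per_group <= HARD_MAX_DISKS:
        raise ValueError(f"per_group must be between 1 and {HARD_MAX_DISKS}")
    # Slice the list into consecutive chunks of at most `per_group` items.
    return [disks[i:i + per_group] for i in range(0, len(disks), per_group)]


if __name__ == "__main__":
    planned = [f"disk{n}" for n in range(70)]   # 70 hypothetical vDisks
    for index, group in enumerate(plan_volume_groups(planned), start=1):
        print(f"volume group {index}: {len(group)} vDisks")
```

With 70 planned vDisks this produces three Volume Groups holding 32, 32 and 6 disks.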
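For a physical Linux node, the connection to the DSIP might look like the sketch below. It assumes the open-iscsi `iscsiadm` tool and the standard iSCSI port 3260, and it only builds the argument lists rather than running them. The function name and the IP address are hypothetical, and a Windows client would use the Microsoft iSCSI Initiator instead.

```python
def iscsi_login_commands(dsip, port=3260):
    """Return iscsiadm argument lists to discover and log in to the
    targets behind a Data Services IP (assumes open-iscsi on Linux)."""
    portal = f"{dsip}:{port}"
    return [
        # Ask the portal which targets it exposes to this initiator.
        ["iscsiadm", "-m", "discovery", "-t", "sendtargets", "-p", portal],
        # Log in to the targets discovered at that portal.
        ["iscsiadm", "-m", "node", "-p", portal, "--login"],
    ]


if __name__ == "__main__":
    # 10.0.0.50 is a placeholder DSIP, for illustration only.
    for argv in iscsi_login_commands("10.0.0.50"):
        print(" ".join(argv))
```

Each list can be passed to `subprocess.run` as-is once the values have been checked for your environment.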
For the example in this article I created a 4-node Windows Server 2016 cluster, which will host SQL Server 2016 as the primary application. The VMs are listed below, along with an AD Domain Controller:
After the Failover Cluster Manager components and tools are installed you can configure the Failover Cluster. Note: as part of the cluster creation a verification wizard is executed to ensure compliance with the strict rules needed to form a cluster, including shared storage tests for SCSI fencing and persistent reservations. The nodes in this case were displayed as follows within Failover Cluster Manager:
The next step is to install SQL Server on the cluster nodes and assign all the necessary dependent resources, which would look like the following:
I installed a second SQL Server instance in the same Failover Cluster so I could do comparisons between different configurations. You can see that in the image below:
After I created the cluster I did a series of tests including using tools such as HammerDB and Benchmark Factory for Databases. During the tests I performed live migrations to ensure the cluster didn’t blink in spite of the load, and it worked flawlessly.
Nutanix AOS 5.17 and AHV make creating guest clusters simple and quick, and support both Linux and Windows guest OS types. You can now configure your favorite clustering solutions without the traditional complexity, which also makes them much easier to automate. A Nutanix AHV cluster can now support any number of cluster nodes supported by the OS vendors. The next step in the evolution of this will be when AHV supports Metro Cluster across sites, along with volumes, which will allow for geo-distributed guest clusters with greatly reduced complexity compared to traditional implementations. The cluster example in this article with Windows Server 2016 and SQL Server 2016 was created just by following the standard Microsoft documentation and attaching a Nutanix Volume Group on AHV directly to the 4 Windows VMs that would form the cluster. That’s it, no special tuning or complexity needed.
This post first appeared on the Long White Virtual Clouds blog at longwhiteclouds.com. By Michael Webster. Copyright © 2012 – 2020 – IT Solutions 2000 Ltd and Michael Webster. All rights reserved. Not to be reproduced for commercial purposes without written permission.
I followed these steps to a T and I can’t get it to work. The cluster configuration claims that there are no disks suitable for shared storage.
I’m running Windows Server 2019 Datacenter and SQL Server 2019 but that shouldn’t matter as I see it.
The validation report says that “The port driver used by the disk does not support clustering. Disk bus type does not support clustering. Disk is on the system bus.”
Any idea what I can do to sort this out?
You need to be using SCSI disks and need to have the Nutanix VirtIO drivers installed.
Thanks for a very quick reply. 🙂
I have the VirtIO drivers installed. SCSI disks where? The guests only see what the VirtIO drivers present to them, or am I missing something here?
The VM config sets the disk type. Did you use the open source VirtIO driver or Nutanix version? Which AOS version?
Ah, yes that’s SCSI. Nutanix-VirtIO-11.3. I remember having to dig around a bit to find drivers that worked for WS2019. AOS 220.127.116.11.
Ok. Thanks. That’s a supported configuration. I tested it in fact. Please log a support request. This should be working. If you follow what I did it should pass all tests.
I’ll check with my Nutanix SE so no need to spend more time on this unless you want to figure it out for yourself. I just wanted to make sure that I’m not missing anything obvious.
When I have it resolved I’ll post my findings in this thread.