VMware’s vCloud Director is a very good way for organisations to start taking advantage of cloud computing, including private, public and hybrid models. Cloud computing can offer new efficiencies and cost savings for organisations that optimise its use. But where do you start? If you go searching for best practices, design considerations and references for designing and building a vCloud Director environment you will find plenty for large scale deployments. But it can be difficult to find much in the way of design considerations for starting off with a small vCloud Director environment, such as for a proof of concept, small lab or pilot. I have previously written about Considerations For Designing a Small vCloud Director Environment – Allocation Models, which takes you through the available allocation models and which ones you might want to use. In this article I hope to offer some advice that will allow you to dip your toes in the water and get up and running quickly without much complication, while allowing you to scale up in the future. This article is a continuation of the series and looks at the different options for storage at a high level, offering some recommendations that you might want to consider. Future articles will discuss the rest of the components you need to consider.
Even if you are designing for a small environment, one resource I highly recommend you review is the vCloud Architecture Toolkit (vCAT), which is a VMware Validated Architecture for Cloud Computing and supporting tools. The vCAT is fully supported by VMware, so if you leverage the design considerations and guidance contained within it you know you can get support.
This article focuses on Storage Design in vCloud Director. In future articles I will cover additional design considerations.
Pre-requisites and assumptions
- You have enough knowledge to install and configure a simple VMware vCloud Director environment.
- You have a minimum of 3 hosts available (1 for Management, 2 for vCloud consumption).
- You know what a Provider Virtual Datacenter and Organisation Virtual Datacenter are.
Designing Storage for vCloud Director
Classes of Storage and Provider Virtual Datacenters
In versions prior to 5.1 of vCloud Director your storage design and storage allocation would have had a major impact on how many Provider Virtual Datacenters (PvDCs) and clusters or resource pools you would require. This is because each PvDC could have only a single tier or class of storage. So the storage tier or class became directly linked (tightly coupled) to the overall service definition of the PvDC, i.e. the characteristics that define what a particular class of service is made up of, such as standard, enhanced or premium (Bronze, Silver, Gold) etc. As an example, in your Standard PvDC you might have had 2.4GHz Xeon CPUs, 96GB RAM (1066MHz) per host and NFS datastores backed by SATA disks (RAID 6 or RAID DP). In your Enhanced PvDC you might have had 2.93GHz Xeon CPUs, 256GB RAM (1333MHz), and FC connected SAN with FC 10K disks in RAID 5. As soon as you wanted to offer another tier of storage you would have needed to offer another PvDC, even if the rest of the service model wasn’t changing.
So what is the impact of adding another PvDC if you want to add another class or tier of storage? Well if you were to follow standard VMware Design Guidance (this is the new term that has replaced best practice), then you would have to set up an entirely new cluster of hosts. Each PvDC should be backed by a Cluster of 2 or more hosts with HA and DRS enabled. In theory it is possible to use a single host in a PvDC, but then you can’t test HA or DRS functionality with vCD. You may be able to see a problem here for a small environment that will be used for a Pilot, PoC or Lab.
The answer to this conundrum in versions of vCD prior to 5.1 is to use resource pools inside a cluster of 2 or more hosts, instead of using different clusters of hosts, to back your PvDCs. Each PvDC is then mapped to one of these resource pools. The different tiers or classes of storage are allocated to all hosts in the cluster and then allocated to the correct PvDC. OrgVDCs are then created in the correct PvDC to consume the different classes of storage. There was and still is no option (as of vCloud Director 5.1) to have a single VM utilise resources from multiple service tiers or classes of storage.
Using resource pools instead of clusters does have drawbacks. Firstly there is the resource pool priority-pie paradox, which may impact the allocation of resources for any given VM or sibling resource pool. Secondly a resource pool is not allowed to consume 100% of the parent resource pool’s resources. Depending on the version of vSphere backing the resource pool it might only be able to consume 94% of the parent resource pool, potentially leaving 6% unallocated. You’re also not able to segment service definitions by different types of compute characteristics, as resource pools span the cluster; this isn’t a problem if you’re doing this for the sole reason of allowing multiple classes of storage. Multiple PvDCs may be competing for resources though. It will also be more complicated when you come to expand the environment, for example if you wanted to move a PvDC to another cluster.
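To put that consumption limit in perspective, here is a rough back-of-the-envelope sketch in Python. The host sizes (dual 6-core 2.4GHz CPUs, 96GB RAM) are hypothetical figures chosen only to show how much of a small two-host cluster is left unallocated if child resource pools can consume only around 94% of the parent pool.

```python
# Back-of-the-envelope sketch: the slice of a small cluster left unallocated
# when child resource pools can only consume ~94% of the parent pool.
# Host sizes below are assumptions for illustration only.

hosts = 2
ghz_per_host = 2 * 6 * 2.4       # e.g. dual 6-core 2.4GHz Xeons per host (assumed)
gb_ram_per_host = 96             # e.g. 96GB RAM per host (assumed)

cluster_cpu_ghz = hosts * ghz_per_host
cluster_ram_gb = hosts * gb_ram_per_host

consumable_fraction = 0.94       # the ~94% figure discussed above; varies by vSphere version

usable_cpu = cluster_cpu_ghz * consumable_fraction
usable_ram = cluster_ram_gb * consumable_fraction

print(f"Cluster capacity:          {cluster_cpu_ghz:.1f} GHz / {cluster_ram_gb} GB")
print(f"Consumable by child pools: {usable_cpu:.1f} GHz / {usable_ram:.1f} GB")
print(f"Left unallocated:          {cluster_cpu_ghz - usable_cpu:.1f} GHz / "
      f"{cluster_ram_gb - usable_ram:.1f} GB")
```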
The workaround to provide different tiers or classes of storage to a single cluster is only required in versions of vCloud Director prior to 5.1. From vCloud Director 5.1 you can assign multiple storage tiers to a single PvDC and even use Storage Profiles, Storage Clusters and Storage DRS, which was also not possible prior to vCD 5.1. Thus the class or tier of storage is no longer required to be coupled or linked to the overall service definition. You can now have multiple storage service definitions per service class, such as Big Data (SATA) and Fast Data (15K FC), in standard, enhanced and premium PvDCs. This means you can have a single PvDC in a single cluster of 2 hosts, and still have multiple tiers of storage. This is a very good reason for using vCloud Director 5.1 for your vCloud environment.
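To make the contrast concrete, here is a small illustrative sketch (Python used purely as pseudo-configuration) of how storage tiers relate to PvDCs before and after vCD 5.1. The PvDC, resource pool and tier names are hypothetical examples, not anything mandated by vCloud Director.

```python
# Illustrative only: storage tier coupling before and after vCD 5.1.
# All names below are hypothetical examples.

# Prior to vCD 5.1: one storage tier per PvDC, so every extra tier forces an
# extra PvDC (backed by another cluster or resource pool).
pvdcs_pre_51 = [
    {"name": "Standard-SATA",  "backing": "resource pool A", "storage_tier": "SATA RAID 6"},
    {"name": "Standard-FC15K", "backing": "resource pool B", "storage_tier": "FC 15K RAID 5"},
]

# vCD 5.1 and later: a single PvDC on the whole cluster can expose several
# storage profiles, decoupling the tier from the service definition.
pvdc_51 = {
    "name": "Standard",
    "backing": "cluster of 2 hosts",
    "storage_profiles": ["Big Data (SATA)", "Fast Data (15K FC)"],
}

for p in pvdcs_pre_51:
    print(f"pre-5.1: {p['name']:14} -> {p['storage_tier']}")
print(f"5.1+:    {pvdc_51['name']:14} -> {', '.join(pvdc_51['storage_profiles'])}")
```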
The discussion above assumes that you are not using Auto Tiering on your arrays to automatically allocate the different tiers of storage. If you are using Auto Tiering then you would only have multiple service tiers, and therefore PvDCs (prior to vCD 5.1) or Storage Clusters and Storage Profiles (vCD 5.1 and later), if you are offering different auto tiering policies. When using array auto tiering with vSphere 5.1 and vCloud Director 5.1 it is recommended that Storage DRS IO load balancing be disabled. This is because the array is handling that for you and SDRS may make false recommendations.
Datastore Sizing
Because of the somewhat unpredictable size of VMs in a vCD environment it’s generally recommended to size datastores to be fairly large and have fewer of them. In a normal vSphere environment you may have had datastore sizes of 292GB, 512GB or 1TB etc. for enterprise workloads; however in vCD you might want to use 2TB or 4TB sizes (or larger), and have fewer. This allows better placement of VMs when the size of each VM varies a lot. It also helps a lot when using Fast Provisioning (linked-clone technology), explained below. The larger datastores also help reduce the risk of running out of space during rapid growth. This guidance is generally applicable to test/dev, pilot, PoC or lab environments and is what I’ve found generally works well. Having larger datastores potentially allows more efficient use of storage, especially when combined with thin and fast provisioning (explained below), but may trade off ultimate performance. A lot will still come down to the storage technology you’re using.
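As a simple illustration of this sizing approach, the sketch below estimates how many large datastores a small environment might need. The VM count, average VM size and growth headroom are assumed figures for the example only; only the 4TB datastore size comes from the guidance above.

```python
# Simple datastore sizing sketch. The VM count, average VM size and headroom
# figures are assumptions for illustration, not recommendations.
import math

expected_vms = 100            # planned number of VMs (assumed)
avg_vm_size_gb = 60           # average provisioned size per VM (assumed)
growth_headroom = 0.30        # keep ~30% free for growth, swap and snapshots (assumed)
datastore_size_gb = 4096      # 4TB datastores, in line with the guidance above

required_gb = expected_vms * avg_vm_size_gb * (1 + growth_headroom)
datastores_needed = math.ceil(required_gb / datastore_size_gb)

print(f"Capacity required (with headroom): {required_gb:.0f} GB")
print(f"4TB datastores needed: {datastores_needed}")
```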
Thin Provisioning and Fast Provisioning
Thin Provisioning and Fast Provisioning are both techniques that when implemented allow for more efficient use of storage resources and for the potential of storage overcommit. Provided you take care both can be used safely and allow for significantly better storage economics. Like all techniques that allow improved efficiency due to overcommitment the underlying assumption is that not everything needs the same scarce resources all at the same time. I’ll briefly explain how they work and what you need to watch out for.
Thin Provisioning works by only allocating blocks to a VM as they are needed by the guest operating system, unlike thick provisioning, which allocates all storage assigned to a VM up front. This means that in cases where you know 20% or 30% of allocated storage is only there ‘just in case’ or ‘because the OS said we had to’, you can effectively use that space elsewhere for other VMs. Provided not all of the VMs that are using thin provisioning need all their storage all at once you can get effective overcommitment of the underlying datastore, and this boosts your ROI. All without any significant performance penalty (assuming a modern hardware environment). Using Thin Provisioning can introduce additional management overhead as you have to ensure your datastores don’t run out of space. Storage DRS in a vCloud and vSphere 5.1 environment can help with this though and reduce the management overhead. Even with the additional management overhead, in many cases the improved economics can far outweigh the risks.
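The sketch below shows the kind of overcommitment arithmetic involved. The per-VM allocation and the fraction of allocated space guests actually consume are assumed figures for illustration; in practice you would also leave free-space headroom rather than filling the datastore.

```python
# Thin provisioning overcommit sketch: how far you can oversubscribe a
# datastore when guests only consume part of what they are allocated.
# Per-VM allocation and consumption fraction are assumptions for illustration.

datastore_gb = 4096            # one 4TB datastore
allocated_per_vm_gb = 100      # thin-provisioned size per VM (assumed)
actual_use_fraction = 0.70     # guests actually consume ~70% of allocation (assumed)

consumed_per_vm_gb = allocated_per_vm_gb * actual_use_fraction
vms_that_fit = int(datastore_gb // consumed_per_vm_gb)   # no headroom reserved here
total_allocated_gb = vms_that_fit * allocated_per_vm_gb

print(f"VMs that physically fit: {vms_that_fit}")
print(f"Allocated: {total_allocated_gb} GB against a {datastore_gb} GB datastore")
print(f"Overcommit ratio: {total_allocated_gb / datastore_gb:.2f}x")
```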
Fast Provisioning is the vCD term for linked clones (similar to vCD’s predecessor Lab Manager, and VMware View). With Fast Provisioning you have a parent image (a.k.a. Shadow VM) and multiple children are linked back to the parent. So you effectively have only one main copy of a VM; every other VM is only a set of configuration files and delta files. The parent image is read only and any changes are written to the delta files. You could think of this as similar to single instance storage in an email system, where only one copy of an email is stored on disk and multiple mailboxes link back to the main copy. If your linked clones have a parent image that contains the OS and base applications and don’t change much, you could literally have hundreds of copies with only a fraction of the assigned storage used. When using Fast Provisioning each parent image can have up to 30 linked clone images. Once you get past this limit another parent image is cloned and the linked clone chain starts again. There are also considerations when your chains need to cross datastores. All of which is explained in the vCD documentation so I won’t go into it here. The potential storage efficiency gains from using Fast Provisioning’s linked-clone technology are staggering. 10x storage overcommitment is possible based on my experience, which is like reducing your storage investment by 90%, or reducing the cost of storage by 90%. But like all things there are tradeoffs and risks.
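The following sketch gives a feel for the arithmetic behind those savings, using the 30-clone-per-parent chain limit mentioned above. The parent image size and the average delta growth per clone are assumptions for illustration only.

```python
# Linked-clone footprint sketch for Fast Provisioning.
# The 30-clone-per-parent limit comes from the text above; the parent image
# size and per-clone delta growth are assumptions for illustration.
import math

clones = 200                  # number of VMs deployed from the catalog (assumed)
parent_image_gb = 40          # size of the parent (shadow) image (assumed)
avg_delta_per_clone_gb = 2    # average delta growth per linked clone (assumed)
max_clones_per_parent = 30    # chain limit noted above

parents_needed = math.ceil(clones / max_clones_per_parent)
linked_clone_gb = parents_needed * parent_image_gb + clones * avg_delta_per_clone_gb
full_clone_gb = clones * parent_image_gb

print(f"Parent images needed: {parents_needed}")
print(f"Linked-clone footprint: {linked_clone_gb} GB vs full clones: {full_clone_gb} GB")
print(f"Effective saving: {1 - linked_clone_gb / full_clone_gb:.0%}")
```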
A couple of things to note about Fast Provisioning. Firstly, disk alignment is an issue. Images are initially aligned, and the parent image is aligned, but as the images start to grow the IOs will become unaligned. This will have an impact on performance. Secondly, if the parent image that is shared by all the linked clones is on VMFS and you’re using a version of vSphere prior to 5.1, your cluster size will be limited to 8 hosts. This is because the maximum number of hosts that can have the same file open is 8 on VMFS prior to vSphere 5.1. This is not a problem with NFS storage, or with VMFS in vSphere 5.1 and later, both of which support cluster sizes up to the maximum of 32. In saying this though, it’s not going to be an issue for a small environment of a couple of hosts, which is what we’re discussing here.
Both Thin Provisioning and Fast Provisioning can be used together and complement each other. By using both you can massively improve the efficiency of your storage and boost your return on investment. But because of the potential for up to 10x storage efficiency or overcommitment when both thin provisioning and fast provisioning are combined (based on my experience in a number of vCD environments) there are risks that need to be managed. You should seriously consider having burst capacity available to cater for any peak demands and also pay careful attention to how much storage is allocated to each OrgVDC.
Recommendations
If you are using vCD prior to 5.1 then I recommend you implement PvDCs based on resource pools in your small cluster of 2 hosts so that you can demonstrate the use of multiple classes or tiers of storage. This allows you to keep the environment simple and small for Pilot, PoC, or Lab use. If you are using vCloud Director 5.1 then you don’t have to create multiple PvDCs and assign them to resource pools. You can simply create a single PvDC and assign the entire cluster to it. You can then use Storage Profiles, Storage Clusters, and Storage DRS to present the different classes or tiers of storage.
For a detailed explanation and considerations of using clusters or resource pools to back PvDCs, I recommend you read Frank Denneman’s article Provider VDC – Cluster or Resource Pool.
Size your datastores so you have fewer of them but they are generally large (2TB to 4TB is fairly common). This allows the best storage utilization and efficiency, especially when using Fast Provisioning and Thin Provisioning. Start out with a small number of datastores and then grow as required.
Use thin provisioning to make the most efficient use of your storage; there is no significant performance impact for doing so, but beware of the risk of sudden growth and plan accordingly. Use Fast Provisioning for dev/test workloads and for situations where storage space efficiency is more important than performance. There is a performance overhead to using Fast Provisioning’s linked-clone technology, but the economics are compelling in a lot of cases.
—
This post first appeared on the Long White Virtual Clouds blog at longwhiteclouds.com, by Michael Webster. Copyright © 2013 – IT Solutions 2000 Ltd and Michael Webster. All rights reserved. Not to be reproduced for commercial purposes without written permission.