A little while ago I wrote an article titled 5 Tips to Prevent 80% of Virtualization Problems. That article was all about storage: how to configure it and the dangers to watch out for, because problems in virtualized environments are predominantly caused by or related to storage in one way or another. In it I explained the impact of queue depths on performance and some of the dangers of making the HBA device queue depths too high. What I didn't know at the time I wrote that article was that the default queue depth for QLogic HBAs was changed between vSphere 4.1 and 5.x. This article will bring you up to date on the change in default values between vSphere 4.x and 5.x and its impacts.
As I outlined in 5 Tips to Prevent 80% of Virtualization Problems, the HBA device queue depth is important because it has a big impact on the number of parallel I/Os a VM can issue and have serviced. It also affects the number of LUNs you need in order to achieve the same performance. If your queue depth is too low you can see very high latency when your VMs try to issue a large number of parallel I/Os, as they will all queue up inside the hypervisor. If your HBA device queue depth is too large you could have lots of I/Os queueing up in the HBA itself, or you could overload your storage array. So you need to strike the right balance.
This article is a follow-up to Cormac Hogan's article titled Heads Up! Device Queue Depth on QLogic HBA's, which was published in response to some queries VMware had received from one of their Technical Account Managers. I recommend you read Cormac's article for the reasons why the change to the defaults was made, as I won't cover that here.
The following are the default HBA device queue depths when using QLogic HBAs for Fibre Channel or FCoE SAN connectivity:
- ESXi 4.1 U2 – 32
- ESXi 5.0 GA – 64
- ESXi 5.0 U1 – 64
- ESXi 5.1 GA – 64
Note: This change does not affect Emulex HBAs, only QLogic.
VMware KB 1267 – Changing the queue depth for QLogic and Emulex HBAs, which documents the process for changing device queue depths for QLogic and Emulex HBAs, has been updated to include the default queue depths for the adapters per vSphere version.
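On ESXi 5.x the change is made by setting a module parameter on the QLogic driver and rebooting the host, as KB 1267 describes. The following is a hedged sketch only: the module name varies by driver version (qla2xxx is the common classic FC driver), so verify it with the first command before setting anything.

```shell
# Confirm which QLogic module is loaded (commonly qla2xxx on ESXi 5.x)
esxcli system module list | grep qla

# Set the per-device (per-LUN) queue depth, e.g. back to 32
# (assumes the qla2xxx driver; adjust -m to match your module)
esxcli system module parameters set -p ql2xmaxqdepth=32 -m qla2xxx

# Verify the parameter; a host reboot is required for it to take effect
esxcli system module parameters list -m qla2xxx | grep ql2xmaxqdepth
```

Always cross-check the exact parameter and module name against KB 1267 for your specific ESXi build before applying this in production.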
So is this change to the default really significant? I think it is, for two reasons: it wasn't documented anywhere (in fact the QLogic HBA documentation still lists the default as 32), and the impact of an overload condition can be a dramatic negative hit to storage performance that could take a while to troubleshoot. That said, for a very long time it had been a common VMware best practice to recommend changing the HBA device queue depth on QLogic HBAs from the default of 32 to 64. In most cases this had a positive impact on performance, with reduced I/O latencies. If you are using Storage I/O Control it will dynamically adjust queue slots between the different VMs on a shared datastore, and you don't need to worry about the device queue depths.
Storage I/O Control takes away the worry and will adjust performance to ensure the latency threshold is met (30ms by default). If you have vSphere Enterprise Plus (4.1 and above) and you run multiple VMs per datastore, you should be making use of Storage I/O Control. The device queue depth is used when there is only one VM per datastore; Disk.SchedNumReqOutstanding is used when there are multiple VMs per datastore, in which case the per-device queue depth is ignored. As Paudie O'Riordan, one of VMware's Senior Staff Technical Support Engineers, says: "let the computer (SIOC) make the decision, not the finger and the wind".
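To see how a host is currently configured, you can inspect and adjust the Disk.SchedNumReqOutstanding advanced setting alongside the driver's queue depth. A sketch for ESXi 5.0/5.1, where this is still a host-wide setting (later releases moved it to a per-device setting):

```shell
# Show the current host-wide Disk.SchedNumReqOutstanding value (ESXi 5.0/5.1)
esxcli system settings advanced list -o /Disk/SchedNumReqOutstanding

# Set it to match your per-device queue depth, e.g. 32
esxcli system settings advanced set -o /Disk/SchedNumReqOutstanding -i 32
```

Keeping this value aligned with the HBA device queue depth avoids one layer of the stack becoming an unintended bottleneck.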
However, there are a few cases where the queue depth of 64 had a detrimental impact, largely when non-virtual systems shared the same storage array as the vSphere hosts. In those cases the vSphere hosts got a far larger proportion of the array's I/O resources, which could hurt the performance of the non-virtual systems. I recommend that where possible you don't share storage arrays between your virtual and non-virtual environments, which avoids these types of impacts. Where that is not possible, you will need to carefully consider the quality of service and storage I/O isolation requirements, and the impact that high-performance vSphere hosts could have on the overall storage array.
The queue depth for all devices on the QLogic HBA totals 4096. So with a per-device (per-LUN) queue depth of 32 you can support 128 LUNs at full queue depth without queueing in the HBA. If you increase the queue depth to 64 (the new default in 5.x), you can support only 64 LUNs at full queue depth. You can still configure more LUNs on the assumption that not all LUNs will use their full queue depth at the same time, effectively overcommitting the queues. But it would pay to consider the impact of a large queue depth if all VMs do start issuing I/Os. As Cormac says in his article, "If you hit the adapter queue limit, then you won't be able to reach the device queue depth, and may possibly have I/Os retried due to queue full conditions."
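The adapter-level math above is worth sketching as a quick sanity check: divide the adapter's total queue depth by the per-device queue depth to get the number of LUNs that can run at full depth without queueing in the HBA.

```shell
#!/bin/sh
# Total adapter queue depth for the QLogic HBA (across all devices)
ADAPTER_QD=4096

# LUNs supported at full depth with the old default of 32
echo $((ADAPTER_QD / 32))   # 128

# LUNs supported at full depth with the new 5.x default of 64
echo $((ADAPTER_QD / 64))   # 64
```

Doubling the per-device queue depth halves the number of LUNs the adapter can service at full depth, which is exactly the trade-off the new default makes.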
This covers only the HBA queue depths. What about the target ports or storage processor ports on the array? Many array storage processor ports have a queue depth of 2048. You should check with your storage vendor what the target port queue depth is for your array, if any. As you can see, if the HBA is configured to issue I/Os to only one target port, a single HBA could easily overwhelm the storage processor and cause a QFULL condition. Fortunately your design should have LUNs configured across multiple target storage processor ports and multiple storage processors to reduce the risk of overloading. So what happens in a QFULL scenario? You can read the QLogic document titled Execution Throttle and Queue Depth with VMware and QLogic HBAs. In essence, the vSphere host will set the queue depth to the minimum, which is 1. You can imagine what that does to your performance.
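If you suspect you are hitting queue-full conditions, the hypervisor logs and esxtop are the places to look. A hedged sketch (the array reports TASK SET FULL as SCSI device status 0x28, and the exact log wording varies by ESXi version, so treat the grep pattern as a starting point):

```shell
# Search the VMkernel log for queue-full / TASK SET FULL indications
grep -i "task set full\|qfull" /var/log/vmkernel.log

# In esxtop, press 'u' for the disk device view and watch:
#   DQLEN - configured device queue depth
#   ACTV  - commands actively being serviced by the device
#   QUED  - commands queued in the hypervisor (sustained non-zero = queueing)
esxtop
```

Sustained queueing in the hypervisor with latency spikes after a QFULL is a strong hint the device queue depth, adapter queue, or target port queue is being overrun.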
I recommend you read Cormac's article Heads Up! Device Queue Depth on QLogic HBA's and the QLogic document Execution Throttle and Queue Depth with VMware and QLogic HBAs. Overall, the change in default queue depth for QLogic HBAs should be positive for performance in most environments. In some environments, however, you may need to adjust the settings to reduce the risks I have outlined here. It's far better to be armed with this knowledge than to suddenly have your storage performance fall off a cliff and not know what caused it. And if you have vSphere Enterprise Plus (4.1 or above) with multiple VMs per datastore, make use of Storage I/O Control.
This post first appeared on the Long White Virtual Clouds blog at longwhiteclouds.com, by Michael Webster. Copyright © 2013 – IT Solutions 2000 Ltd and Michael Webster. All rights reserved. Not to be reproduced for commercial purposes without written permission.