Over the last couple of weeks the debate about whether or not to enable Jumbo Frames has come up quite a bit, largely driven by discussion around VSAN in vSphere 5.5 and other types of network-based storage access. A couple of people have questioned the wisdom of recommending Jumbo Frames as a best practice for VSAN, due to a perceived negligible benefit compared to the perceived complexity of implementing it (I will drill into these concerns below). I was quick to point to various test results showing at least a 10% performance benefit, including my previous articles Jumbo Frames on vSphere 5 and Jumbo Frames on vSphere 5 Update 1. However my previous testing was purely network performance, not storage access. So I thought maybe things are different when you're using NAS storage, or VSAN-type storage, over a 10G network, and maybe Jumbo Frames doesn't make all that much difference in that scenario. To find out, I tested some storage access scenarios over the 10G LAN in My Lab Environment to see whether Jumbo Frames made any difference or not.
I've heard reports that some people testing VSAN have seen no noticeable performance improvement when using Jumbo Frames on the 10G networks between the hosts. Although I don't have VSAN in my lab just yet, my theory is that the network is not the bottleneck with VSAN. Most of the storage access in a VSAN environment will be local; it's only the replication traffic, and traffic when data needs to be moved around, that will go over the network between VSAN hosts. The latency introduced by the network in those cases would be negligible compared to the cost of accessing the local host storage. Then there is the argument that LRO/LSO on modern 10G NICs negates the benefit of Jumbo Frames. The thing is, LSO only helps with outbound traffic, and LRO doesn't entirely remove the per-packet processing overhead on inbound traffic. Linux at least has LRO support, whereas Windows doesn't yet. And VSAN storage traffic isn't all that is going across the network.
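To put the per-packet processing argument in perspective, here is a rough back-of-the-envelope calculation (a minimal Python sketch; the 10Gbps line rate and full-size frames are assumptions for simplicity) of how many frames per second a host has to deal with at a standard 1500 byte MTU versus a 9000 byte MTU:

```python
# Rough frames-per-second estimate at 10Gbps line rate for standard vs jumbo MTU.
# Assumes full-size frames and ignores inter-frame gap and preamble for simplicity.

LINK_BPS = 10_000_000_000   # 10Gbps
ETH_OVERHEAD = 18           # Ethernet header (14 bytes) + FCS (4 bytes)

def frames_per_second(mtu_bytes: int) -> float:
    frame_bits = (mtu_bytes + ETH_OVERHEAD) * 8
    return LINK_BPS / frame_bits

for mtu in (1500, 9000):
    print(f"MTU {mtu}: ~{frames_per_second(mtu):,.0f} frames/sec at line rate")

# MTU 1500: ~823,452 frames/sec at line rate
# MTU 9000: ~138,612 frames/sec at line rate (roughly 6x fewer frames to process)
```

Even with offloads doing part of the work, six times fewer frames means six times fewer opportunities to pay any per-packet cost that remains.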
Does any of this really matter when it comes to setting best practices? Not entirely. But to explain that, first we need to look at what best practices actually are. Let's take a look at what the VCDX Boot Camp book by John Arrasjid, Ben Lin and Mostafa Khalil says about best practices, along with my quote on page 20 of that book.
“Use of best practices may apply for a majority of implementations, but these are not customer specific or applicable in all situations. A qualified design expert knows when to deviate from best practice while providing a justifiable and supportable solution.” I go on to elaborate on that point by saying “Best practices are a baseline from which we work in the absence of specific requirements that would justify deviation. Knowing why it is a best practice is important so that you know where to create a new best practice specific to your design and customer.”
So what we can immediately take from the above is that best practices, while beneficial to the majority of implementations, may not be applicable in all situations. This is the case with Jumbo Frames, as there are a lot of dependencies. If the network or CPU is not the bottleneck, then it's unlikely Jumbo Frames by itself will be a silver bullet for your application or storage access performance issues. The benefit will also depend on the switching infrastructure and NICs in use. But in the majority of situations Jumbo Frames could be beneficial, especially when using 10G or higher bandwidth network infrastructure. As you'll see shortly from my test results, Jumbo Frames improved performance significantly in the situations I tested in My Lab Environment.
Next we need to look at what the trade-offs are. If I enable Jumbo Frames for VSAN or other types of traffic, are there any downsides? Firstly, it has to be enabled end to end along the network communication path in order to be effective. That means the VMs, the VMkernel ports transmitting VSAN or vMotion traffic, the virtual switches, and the physical switches and routers all need to be configured to accept Jumbo Frames. Why 10G-plus equipment doesn't come out of the factory configured to accept Jumbo Frames I don't know. In any case the necessary configuration is a trivial exercise when setting up new infrastructure, but retrofitting existing network switches, routers and the virtual environment at large scale, if it wasn't done originally, can be harder and more complex. In the context of 10G+ storage and vMotion networks, which are meant to be flat and closely connected, I would argue it isn't that much trouble (a minimal example of the host-side configuration follows below). But we must accept there is more to configure. This is one trade-off.
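For reference, the vSphere side of that configuration is only a couple of commands per host. Below is a minimal Python sketch driving esxcli (you could equally just type the two esxcli commands in the ESXi Shell); vSwitch0 and vmk1 are placeholder names, and a vDS or the physical switches would be configured through vCenter and the switch CLI respectively rather than this way.

```python
# Minimal sketch: raise the MTU to 9000 on a standard vSwitch and a VMkernel port
# by driving esxcli. vSwitch0 and vmk1 are placeholder names for this example.
import subprocess

commands = [
    # Allow jumbo frames on the virtual switch itself.
    ["esxcli", "network", "vswitch", "standard", "set", "-v", "vSwitch0", "-m", "9000"],
    # Raise the MTU on the VMkernel port carrying the storage or vMotion traffic.
    ["esxcli", "network", "ip", "interface", "set", "-i", "vmk1", "-m", "9000"],
]

for cmd in commands:
    print("Running:", " ".join(cmd))
    subprocess.check_call(cmd)

# The physical switch ports (and any routers in the path) must also allow a jumbo
# MTU, e.g. 9216 on the Dell 8024 used in my lab, otherwise the path is incomplete.
```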
But what if you enable it for only part of the network path, or you make a mistake with the Jumbo Frames configuration? This is a very common concern or objection to enabling Jumbo Frames. In this case you're going to get packets fragmented down to the standard frame size, as if Jumbo Frames wasn't enabled at all. Thanks to Path MTU Discovery, that's pretty much it. You're no worse off than if Jumbo Frames had never been set, so no harm done; Jumbo Frames will simply not be used. All that configuring Jumbo Frames really does is raise the upper limit on the Maximum Transmission Unit beyond the standard frame size (normally 1500 bytes); it has no impact on the minimum size of a frame that is sent. If you do make a mistake, troubleshooting where it's gone wrong isn't that difficult either. With vSphere 5.1 and above you can use the network health check, or pings with a large packet size (8972 bytes) and the do-not-fragment option set, to find out which part of the network path is incorrectly configured (see the sketch below).
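Here is a minimal sketch of that troubleshooting approach from a Linux VM; the target addresses are placeholders, and on an ESXi host itself you would use vmkping -d -s 8972 against the relevant VMkernel interfaces instead.

```python
# Probe each endpoint in the storage path with an oversized, do-not-fragment ping.
# An endpoint that answers a standard ping but fails the 8972 byte probe points to
# where the jumbo frame configuration is missing. 8972 bytes of ICMP payload
# + 8 byte ICMP header + 20 byte IP header = 9000 bytes on the wire.
import subprocess

TARGETS = ["10.0.0.1", "10.0.0.2"]  # placeholder addresses for the endpoints to test

def jumbo_ping_ok(host: str, payload: int = 8972) -> bool:
    # Linux iputils ping: -M do sets the don't-fragment bit, -s sets the payload size.
    result = subprocess.run(
        ["ping", "-M", "do", "-c", "3", "-s", str(payload), host],
        capture_output=True,
    )
    return result.returncode == 0

for host in TARGETS:
    status = "OK" if jumbo_ping_ok(host) else "FAILED - check the MTU on this path"
    print(f"{host}: jumbo frame probe {status}")
```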
So from the above I would argue it's reasonable to keep Jumbo Frames as a best practice recommendation when using Ethernet-based storage access on 10G+ networks, even with VSAN and other technologies that behave similarly. It won't cause any harm, and in many situations it will be of benefit. As bandwidth scales, Jumbo Frames may provide even more benefit, depending on how NICs and switches develop. But just how much benefit? Is it really worth it? Now it's time to review my test setup and my test results.
For my testing I used two of my ESXi 5.0 Update 2 hosts with the following config: Dell T710s with 2 x X5650 CPUs (6 cores per socket, 2.66GHz), 72GB RAM, and an Intel X520-T2 10G NIC. No advanced settings (such as interrupt coalescing) were changed on the hosts with regard to the 10G NICs; the hosts' NICs were left at their default settings. Both hosts were connected to a Dell 8024 10G switch. The 8024 switch is configured for Jumbo Frames (MTU 9216), and the vDS that my test VMs are connected to is also configured to allow Jumbo Frames (MTU 9000). During the tests I changed only the endpoints to either accept or not accept Jumbo Frames. For the IO load generator and storage server I used two VMs, one on each host, each configured with 6 vCPUs and 8GB RAM. The VMs were configured with VMXNET3 vNICs, and interrupt moderation was disabled on the vNIC driver within the guest OS. The VMs used PVSCSI vSCSI adapters with the default settings.
Each VM was on a different host during the testing. To drive the storage load I used IOMeter. The storage server had a single thin VMDK backed by either a Micron or Fusion-io flash PCIe card, and this VMDK was the target of the IO workload, accessed from the load generator VM over the 10G network. The IO workload pattern was 100% sequential read with 32 outstanding IOs. I varied only the IO size and whether or not Jumbo Frames were used during each test run, as the tests were primarily intended to show the impact of the network configuration, not of different storage IO patterns. I performed multiple test runs and the results shown are the lowest of the runs for each IO size. All measurements are taken from the IOMeter logs on the IO load generator VM.
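I used IOMeter for the tests above, but for anyone who wants to reproduce a similar workload pattern from a Linux guest, here is a rough equivalent driven from Python using fio; the tool choice, target file and runtime are my assumptions for illustration, not the exact setup I used.

```python
# Rough Linux equivalent of the IOMeter pattern used above: 100% sequential read,
# 32 outstanding IOs, varying only the IO size between runs.
# The fio tool, target file and runtime here are assumptions for illustration.
import subprocess

def run_seq_read(block_size, target="/mnt/storage/testfile", runtime_s=120):
    subprocess.check_call([
        "fio",
        "--name=seq_read",
        "--rw=read",               # 100% sequential read
        "--bs=" + block_size,      # IO size under test
        "--iodepth=32",            # 32 outstanding IOs
        "--ioengine=libaio",
        "--direct=1",              # bypass the guest page cache
        "--filename=" + target,
        "--runtime=" + str(runtime_s),
        "--time_based",
    ])

for bs in ("4k", "8k", "16k", "32k", "64k"):
    run_seq_read(bs)
```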
Here are my results; your mileage may vary:
You can see from the results that as the bandwidth of the 10G NICs reaches saturation point (at the 64K IO size) there is almost no difference between Jumbo and Non-Jumbo in terms of throughput and latency. Up until that point, however, there is between a 9% and 23% improvement in IOPS (and throughput), and between a 9% and 32% improvement in latency, due to Jumbo Frames. I also found that the CPU cost with Jumbo Frames was lower than with Non-Jumbo. To achieve the same throughput and latency in the 64K IO size test the client used 40% more CPU with Non-Jumbo than with Jumbo (13.9% CPU utilization Non-Jumbo vs 9.9% with Jumbo). The offload capabilities of the 10G NIC did, however, help keep the CPU cost of the Non-Jumbo tests in check: even at the very high packet and throughput rates of the Non-Jumbo tests, CPU utilization did not exceed 16% on the IO load generator VM.
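As a sanity check on why the results converge at the 64K IO size, here is a quick calculation (Python, assuming a nominal 10Gbps link and ignoring protocol overhead) of the IOPS ceiling a single 10GbE link imposes at each IO size:

```python
# Approximate IOPS ceiling imposed by a single 10GbE link, ignoring protocol overhead.
LINK_BYTES_PER_SEC = 10_000_000_000 / 8  # 10Gbps is roughly 1.25 GB/s

for io_kb in (4, 8, 16, 32, 64):
    io_bytes = io_kb * 1024
    print(f"{io_kb}K IO: ~{LINK_BYTES_PER_SEC / io_bytes:,.0f} IOPS maximum")

# At 64K the ceiling is only ~19,073 IOPS, so the wire rather than per-packet
# processing becomes the limiting factor and Jumbo vs Non-Jumbo converge.
```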
One of my readers was also kind enough to supply some test results comparing Jumbo Frames to Non-Jumbo Frames for NFS-based storage on a Cisco 2020 10G infrastructure. Here is the graph showing a consistent 11% improvement in IOPS for their testing; 8204 is the Non-Jumbo test run and 8205 is the Jumbo test run.
In addition to the improved IOPS, the test results showed that with Jumbo Frames 85% of IOs were serviced within 500us, versus 65% within 500us for Non-Jumbo. CPU utilization on the NFS filer was 60% during the Jumbo test versus 80% during the Non-Jumbo test. These tests demonstrate higher IO throughput, lower latency and lower CPU utilization, as the majority of my tests did. This is further evidence in favour of using Jumbo Frames, and in support of it being a best practice for Ethernet-based storage and high throughput 10G+ networks.
Based on my test results and findings above I would recommend that Jumbo Frames be enabled on 10G+ networks, especially when using Ethernet-based storage access, vMotion, and other high throughput applications (such as Oracle RAC interconnects). This includes when using technologies like VSAN. I see no harm in this being recommended as a best practice, provided customers and partners understand what best practices are and also understand Jumbo Frames. If you're designing a new infrastructure around NAS, VSAN or other Ethernet-based storage access, then there is very little overhead in including Jumbo Frames up front. In my view the applicability of Jumbo Frames is only going to increase with the adoption of network overlay and network virtualization technologies such as VMware NSX and VXLAN. So even if you think you can get away without Jumbo Frames now, it's very likely to be in your future.
I would be interested in your feedback, and also in any other test results using Jumbo Frames on 10G or higher bandwidth networks with NAS and VSAN-type storage environments.
This post first appeared on the Long White Virtual Clouds blog at longwhiteclouds.com, by Michael Webster. Copyright © 2013 – IT Solutions 2000 Ltd and Michael Webster. All rights reserved. Not to be reproduced for commercial purposes without written permission.