20 Responses

  1. Paul Meehan (@PaulPMeehan)

    Great analysis, Michael. It's true it can be a "should we or shouldn't we" situation, and there is an inbuilt *fear* of jumbo frames, if that's the correct word. So it's very useful to debate the pros and cons and arrive at a definitive conclusion, which I think at least moves it on from the typical customer scenario. I think your points are fair, reasonable and well argued.

    In my experience, even with 10Gb/s, it doesn't always get configured on day 1, due to fear of adverse effects on the network, and ironically it may get turned on when performance becomes a problem. It can also be due to the lack of black-and-white guidance, where a vendor might only "suggest" turning it on.

    I have seen (even) physical Windows servers performing non-optimally due to the native TCP window size. A particular case in point was a multi-threaded application replicating data over a WAN: it performed a dedupe hash-lookup and then sent the deltas across. Windows performed optimally after an increase in the number of parallel TCP connections, despite the fact that it was maybe only pushing 80Mb/s on a 10Gb/s Etherchannel. I know that's an aside, but I suppose it supports the view that there are worse things that can happen, and other areas that warrant further tuning, and maybe it's better to just turn it on and take every % available.
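    A rough sketch of why adding parallel connections can help: a single TCP stream cannot exceed window size divided by round-trip time. The window and RTT figures below are illustrative assumptions, not numbers from the comment.

```python
# Single-stream TCP throughput is capped at window_size / RTT.
# Assumed figures: a 64 KB window (classic default, no window scaling)
# and a 50 ms WAN round trip. Neither number is from the comment above.
window_bytes = 64 * 1024
rtt_seconds = 0.050

per_stream_mbit = window_bytes / rtt_seconds * 8 / 1e6
print(f"one stream: {per_stream_mbit:.1f} Mbit/s")                    # one stream: 10.5 Mbit/s

streams = 8
print(f"{streams} streams: {streams * per_stream_mbit:.1f} Mbit/s")   # 8 streams: 83.9 Mbit/s
```

    With those assumed numbers, one stream tops out around 10 Mbit/s regardless of the 10Gb/s link underneath, which is why multiplying the streams (or the window) moves the needle far more than any frame-size tuning.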

    Paul Meehan

    @PaulPMeehan

  2. Ryan B

    Does fragmentation or PMTUD work when both endpoints are on the same subnet?

  3. Derek Seaman

    At VMworld 2013 I heard both sides of the argument, with different recommendations, in various sessions. One session said that the increased risk of human error in enabling jumbo frames outweighs the slight performance boost you get. Yes, the initial config of jumbo frames is easy, but down the line, when switches get replaced or other changes are made, you can goof up the config, and breaking NFS/iSCSI storage would be a very bad day.

    In another session they said that jumbo frames are easy to configure, most hardware supports them today, so why not do it? As I recall, some first-generation HP Flex10 NICs and Flex10 modules would not support the full 9K jumbo frame.

    As you point out, there's no right or wrong answer. Personally, I use jumbo frames on our UCS blades and standard frames on our legacy HP blade servers.

  4. Jeff Drury

    Having worked a support desk for an iSCSI/NFS storage product, I would recommend not having jumbo frames as a standard. While Path MTU Discovery allows devices to still communicate, it introduces latency, and with storage packets that is bad. Jumbo frames are a subnet-wide configuration, and as such it can be very difficult for less skilled admins to keep configurations consistent.

    A common configuration error, and subsequent storage performance troubleshooting nightmare, comes when equipment is replaced. Consider a VSAN spread across several physical ESX hosts. If there is a hardware issue on a host and it is replaced by another physical device, the jumbo frames config could be missed. Now you have VSAN nodes talking to each other at different MTUs. Latency goes up, performance goes down, and frustration on the administration team goes through the roof.

    To me, the issues that I have seen with the configuration are not worth the ~10% performance boost. Also, in several tests that I have run, the 10% increase from jumbos is not a blanket statement across IO workloads; some workloads show the same or slightly decreased performance with jumbo frames. Throw Murphy's Law and PEBCAK errors in there, and in my mind it falls into advanced config, not standard config.

  5. Charles Gillanders

    There's something not quite right here about Path MTU Discovery always working. I have a flat (i.e. L2, not L3) multi-site bridged network with jumbo frames enabled on all paths. While I was switching on jumbo frames, I experienced a complete iSCSI traffic stall between devices in different sites (iSCSI replication from Compellent to Compellent). Only once the switch ports for the site-to-site links had jumbo frames enabled did the replication traffic flow again.

    It's been a while since I did any serious Cisco work, but I think the problem occurred here precisely because there's no L3 device between my two storage controllers. From [http://en.wikipedia.org/wiki/Path_MTU_Discovery]:

    For IPv4 packets, Path MTU Discovery works by setting the Don't Fragment (DF) option bit in the IP headers of outgoing packets. Then, any device along the path whose MTU is smaller than the packet will drop it, and send back an Internet Control Message Protocol (ICMP) Fragmentation Needed (Type 3, Code 4) message containing its MTU, allowing the source host to reduce its Path MTU appropriately. The process is repeated until the MTU is small enough to traverse the entire path without fragmentation.

    In this case there are no L3 "devices" along the path, so the sender never gets back any ICMP response; it keeps sending jumbo-sized packets and simply never gets a reply.

    As far as I can figure out, the example you provided of devices on the same subnet working this out actually uses the same ICMP mechanism.

    That is: device 1 sends a jumbo frame to device 2. All the L2 switch ports along the way on the local subnet support jumbo frames, so the frame gets to device 2 without problems. Device 2 isn't configured for jumbo frames, so it sends an ICMP message back to device 1, which steps back its frame size when communicating with device 2. The key here is that the switches all support jumbo frames and have passed the frame successfully; if they don't, that's when problems occur.
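    The step-down behaviour described above can be sketched as a toy loop (purely illustrative; real PMTUD probes per-packet on a live connection). It also shows the failure mode on a flat L2 path: with no L3 hops there is nothing to send the ICMP message, so the sender happily keeps its large MTU while a non-jumbo switch silently drops the frames.

```python
def discover_path_mtu(hop_mtus, initial=9000):
    """Toy PMTUD: probe with DF set; the first L3 hop whose MTU is
    smaller than the probe drops it and reports its own MTU back in
    an ICMP Fragmentation Needed message, so the sender retries."""
    mtu = initial
    while True:
        for hop_mtu in hop_mtus:
            if mtu > hop_mtu:
                mtu = hop_mtu   # ICMP reply carries the hop's MTU
                break           # retry with the smaller probe
        else:
            return mtu          # probe traversed every hop: path MTU found

print(discover_path_mtu([9000, 1500, 9000]))  # 1500
print(discover_path_mtu([]))                  # 9000: no L3 hops, nothing ever steps it down
```

    The empty-path case is the flat L2 scenario: the algorithm "succeeds" at 9000 because no hop ever objects, which on a real bridged network means oversized frames just vanish instead of being negotiated down.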

  6. Charles Gillanders

    One more comment: on one of these L2 switched paths we had a Gigabit Ethernet port provided to us by a WAN service provider.

    On this path we couldn't get jumbo frames enabled at all until the service provider had enabled them on their network. Even then we couldn't use the maximum jumbo frame size on our network, because once the service provider added their QinQ tags, our maximum-size frames exceeded their maximum frame size.

    We had to step our max MTU down to 8192 (the next largest commonly accepted size across all of our equipment).
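    The arithmetic behind that squeeze is straightforward. A rough sketch follows; the 4-byte tag size is standard 802.1Q, but the provider's frame limit used here is an invented example, not a figure from the comment.

```python
host_mtu = 9000           # IP packet size the hosts want to send
eth_overhead = 14 + 4     # Ethernet header + FCS
qinq_tags = 2 * 4         # inner 802.1Q tag + provider's outer (QinQ) tag

on_wire = host_mtu + eth_overhead + qinq_tags
print(on_wire)            # 9026 bytes on the provider's wire

# If the provider's gear caps frames at, say, 9018 bytes (assumed figure),
# a 9000-byte host MTU no longer fits once the outer tag is added,
# so the hosts have to step their MTU down below the cap.
provider_limit = 9018
print(on_wire <= provider_limit)  # False
```

    The same calculation explains why "9000 everywhere" isn't a single knob: every encapsulation layer along the path eats into whatever frame size the transit equipment actually accepts.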

    At the end of the day we got a minor (<10%) bump in throughput on our local iSCSI traffic between ESXi and storage, a bit more of a bump in NFS traffic between Solaris hosts, and no apparent change at all for the Compellent replication. So it was a good bit of work for not a whole lot of benefit, but it does mean that we have implemented "best practice" for both Compellent and EqualLogic, which has helped when making service calls.

  7. Josh Odgers (@josh_o

    With Nutanix + Arista (where jumbo frames are enabled by default on Arista switches) this is a no-brainer! Great work Mike!

  8. Greg Lenczewski

    Good findings Michael, especially when combined with feedback from the community. To mitigate one of the issues mentioned above, about equipment failure and replacement equipment being misconfigured, I think a simple checklist could prevent some of these misconfigurations from happening in the first place.
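    A checklist like that can even be scripted. A minimal sketch of the idea (hostnames and MTU values are invented, and actually gathering the real MTUs from hosts or switches is left out):

```python
# Flag any host whose configured MTU drifts from the expected value.
# The inventory dict is made-up example data; in practice you would
# populate it from the hosts or switches themselves.
expected_mtu = 9000
inventory = {
    "esx01": 9000,
    "esx02": 9000,
    "esx03": 1500,   # e.g. a replacement host whose config was missed
}

mismatched = sorted(h for h, mtu in inventory.items() if mtu != expected_mtu)
print(mismatched)    # ['esx03']
```

    Run on a schedule, a check like this turns the "replacement host with the wrong MTU" failure mode from a troubleshooting nightmare into a one-line alert.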

  9. Brian

    Very interesting discussion. I will do some further research as a result.

    I'm wondering how switching technologies impact jumbo frames, i.e. cut-through vs. store-and-forward, etc. Is this a factor in which brands of switches drop jumbo frames when not configured for them? Also, is it safe to assume that routers won't drop jumbo frames, but rather just perform fragmentation and increment the jumbo frame counter on the interface output, or will they also increment errors? I might do some tests, as all this has me curious now.

    P.S. I work for an ISP and we just keep the standard 1500 MTU everywhere.

  10. foobar13

    I just enabled jumbo frames of size 9000 on an aggregated (balance-rr) direct connection over two 1GbE NICs between two servers.

    Here are the achieved stats from some simple tests copying a 12 GB directory between them using netcat.

    1500 MTU:
    177 MB/s (1416 Mbit)

    9000 MTU:
    223 MB/s (1784 Mbit)

    The theoretical max is 250 MB/s, so switching to jumbo frames improved throughput a lot compared to the standard MTU.

    There are lots of limiting factors in the test, though: disk I/O, using tar on the remote ends (which is single-threaded and CPU-limited), a single TCP stream, etc.
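    For comparison, a rough per-frame overhead calculation shows how much of that gap jumbo frames alone can explain. This assumes plain IPv4/TCP with no options and standard Ethernet framing overhead; it ignores all the other limiting factors listed above.

```python
def tcp_goodput_mbit(mtu, line_rate_mbit=2000):
    """Best-case TCP goodput for a given MTU on a 2 x 1GbE bond.
    Assumes 40 bytes of IPv4+TCP headers per packet and 38 bytes of
    Ethernet overhead (preamble + header + FCS + inter-frame gap)."""
    ip_tcp_headers = 40
    eth_overhead = 38
    return line_rate_mbit * (mtu - ip_tcp_headers) / (mtu + eth_overhead)

print(f"{tcp_goodput_mbit(1500):.0f} Mbit/s at MTU 1500")  # 1899 Mbit/s at MTU 1500
print(f"{tcp_goodput_mbit(9000):.0f} Mbit/s at MTU 9000")  # 1983 Mbit/s at MTU 9000
```

    On paper the jump from 1500 to 9000 is only a few percent, so the larger gain measured here likely also reflects reduced per-packet CPU and interrupt work, consistent with the single-threaded tar being a bottleneck.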

