LACP Configuration for VMware Distributed Switch and Dell Force10 OS

In the article Configuring Scalable Low Latency L2 Leaf-Spine Network Fabrics with Dell Networking Switches I wrote about the general setup of the leaf-spine architecture in my Nutanix performance lab with Dell Force10 switches. Then in VMware Distributed vSwitch LACP Configuration with Dell Force10 and Cumulus Linux I wrote about the VMware vDS and Cumulus Linux configuration on Dell Force10 network switches. This article covers the port-channel configuration for hosts connected to Dell Force10 switches running the Force10 OS (FTOS).

As with the configurations in the previous articles, each host has 2 x 10GbE ports, and each port is connected to a different Top of Rack (ToR) or Middle of Rack (MoR) switch that is part of an MLAG/CLAG/VLT pair.

Below is the physical port configuration, which is similar to a standard port apart from the additional port-channel specification. The port-channel line should refer to the same port-channel number on both ToR/MoR switches.

interface TenGigabitEthernet 0/0
 description Node-ntnx3450a1
 no ip address
 mtu 9216
 flowcontrol rx on tx off
!
 port-channel-protocol LACP
  port-channel 1 mode active
 no shutdown
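
Because the member port on each ToR/MoR switch has to reference the same port-channel number, one way to catch a mismatch is to compare the saved configurations of the two switches. The following is a minimal Python sketch of that idea, not an FTOS feature. The function names and the choice to key on the description line are assumptions made for illustration, and the parser only understands stanzas laid out like the one above.

```python
import re


def port_channels_by_description(config_text):
    """Map each interface description to the port-channel number it joins.

    Assumes each stanza has an 'interface' line, a 'description' line,
    and a 'port-channel N mode <mode>' line, as in the example above.
    """
    mapping = {}
    description = None
    for raw in config_text.splitlines():
        line = raw.strip()
        if line.startswith("interface "):
            description = None  # new stanza, forget the previous host
        elif line.startswith("description "):
            description = line[len("description "):]
        else:
            # Matches 'port-channel 1 mode active' but not
            # 'port-channel-protocol LACP' or 'vlt-peer-lag port-channel 1'.
            match = re.fullmatch(r"port-channel (\d+) mode \S+", line)
            if match and description is not None:
                mapping[description] = int(match.group(1))
    return mapping


def find_mismatches(switch_a_config, switch_b_config):
    """Return descriptions whose port-channel number differs between
    the two switches, or that appear on only one of them."""
    a = port_channels_by_description(switch_a_config)
    b = port_channels_by_description(switch_b_config)
    return sorted(d for d in a.keys() | b.keys() if a.get(d) != b.get(d))
```

Passing the two switches' configuration text to find_mismatches returns an empty list when every host description maps to the same port-channel number on both sides.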

The port-channel config should be the same for each host on both ToR/MoR switches. The parameter vlt-peer-lag port-channel 1 specifies that this port-channel is the same as port-channel 1 on the peer switch.

interface Port-channel 1
description Node-ntnx3450a1
no ip address
mtu 9216
portmode hybrid
switchport
spanning-tree rstp edge-port bpduguard shutdown-on-violation
spanning-tree 0 portfast bpduguard shutdown-on-violation
lacp fast-switchover
vlt-peer-lag port-channel 1
no shutdown
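
Since the same two stanzas repeat for every host on both switches, they lend themselves to templating. The sketch below is one possible way to do that in Python and is not part of my lab setup. The templates reuse only the commands shown above; the function name render_host_config and the second host in the usage loop are hypothetical.

```python
# Templates copy the commands from the two stanzas above; only the host
# description, the port, and the port-channel number vary per host.
MEMBER_PORT_TEMPLATE = """\
interface TenGigabitEthernet {port}
 description {host}
 no ip address
 mtu 9216
 flowcontrol rx on tx off
!
 port-channel-protocol LACP
  port-channel {channel} mode active
 no shutdown"""

PORT_CHANNEL_TEMPLATE = """\
interface Port-channel {channel}
 description {host}
 no ip address
 mtu 9216
 portmode hybrid
 switchport
 spanning-tree rstp edge-port bpduguard shutdown-on-violation
 spanning-tree 0 portfast bpduguard shutdown-on-violation
 lacp fast-switchover
 vlt-peer-lag port-channel {channel}
 no shutdown"""


def render_host_config(host, port, channel):
    """Return the member-port and port-channel stanzas for one host."""
    values = {"host": host, "port": port, "channel": channel}
    return "\n!\n".join(
        (MEMBER_PORT_TEMPLATE.format(**values),
         PORT_CHANNEL_TEMPLATE.format(**values))
    )


if __name__ == "__main__":
    # The first entry is the host from this article; the second is hypothetical.
    hosts = [("Node-ntnx3450a1", "0/0", 1), ("Node-ntnx3450a2", "0/1", 2)]
    for host, port, channel in hosts:
        print(render_host_config(host, port, channel))
        print("!")
```

Using one port-channel number per host in both templates keeps the member port, the port-channel, and the vlt-peer-lag reference aligned, which is the property the configuration depends on. Review the generated text before applying it to a switch.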

Final Word

It is fairly simple to set up port-channels for redundant LACP configurations from hosts to a VLT/MLAG pair of ToR/MoR Dell Force10 switches running FTOS, although it does mean additional configuration for every single host port on every switch. The main use case is where you are likely to run something like VMware NSX and wish to utilize multiple links, as NSX does not support Load Based Teaming. For most normal enterprise virtualization use cases, Load Based Teaming provides the best balance, the best utilization, and the simplest configuration of both the vSphere Distributed Switch and the physical switches.

—

This post first appeared on the Long White Virtual Clouds blog at longwhiteclouds.com. By Michael Webster. Copyright © 2012 – 2015 – IT Solutions 2000 Ltd and Michael Webster. All rights reserved. Not to be reproduced for commercial purposes without written permission.

10 Responses

Nick Cutting January 14, 2016 at 10:50 pm | Permalink

While you don't recommend VLT/LACP down to the hosts for ESX hosts not using NSX, do you still recommend using VLT up to the spine/core switches? I plan on connecting two 4820Ts up to a Cisco 68k VSS pair, and in the Nexus world we could do a “back-to-back” style vPC. Is there something similar available for VLT? (I understand the VSS side would just be a single-control-plane multichassis EtherChannel, with 2 ports on each 68k.)

Any thoughts?

vcdxnz001 January 14, 2016 at 11:06 pm | Permalink

Yes. Absolutely. Recommend VLT/VPC. VLT/VPC in the spine and leaf for smaller environments and VLT leaf / L3 spine for larger environments. VSS just turns two switches in effect into one, like stacking. So you should be able to do LACP northbound. But you may want to consider L3. With VLT config the LAG between switches is recommended as static as well, rather than LACP. Hope this helps. The main reason for not using LACP to hosts in most cases is just the amount of config required. With automation though that concern can be reduced. One other thing to consider is orphaned ports in an error condition. So using the equivalent of link state tracking might be a good idea also.

Nick Cutting January 14, 2016 at 11:43 pm | Permalink

Thank you, yes, I am going to thoroughly test all failure scenarios. Definitely will use the static LAG between the Dell switches. I am a little unsure about the QSFP to 10Gig breakout cables and how they will interoperate with the Cisco 6880 chassis 10Gig ports. One reason is that, although not a problem now, inter-rack leaf/spine could be an issue if the 10Gig end is like a twinax cable rather than a transceiver that accepts fiber. The 4x10Gig to QSFP cable for the Cisco 6880 is not available until April, and the code required to run it possibly not until the end of the year, so it's 10Gig uplinks for now. As for layer 3 to the leaf – this is not going to fly with the server chaps, at least not this year.

@vcdxnz001 January 15, 2016 at 1:49 pm | Permalink

Yeah, you need to be a bit careful with the breakout cables when connecting to Cisco. Cisco really only likes to use their own cables and transceivers, so you may need to issue the command service unsupported-transceiver. Best to speak to your Cisco rep or TAC about that. Across racks, fibre is definitely recommended. Twinax is severely limited in terms of distance. I use breakout cables in my lab, but I've got all Dell switches, so it works just fine.

Rob May 4, 2016 at 10:01 am | Permalink

Great Blog. I have been doing some proof of concept work with 2 Dell S4048-ON switches. I am incorporating the VLT domain (similar to you), and now trying to add some hosts into the mix.

I have a couple of powerful desktop computers with 10G QLogic NICs for testing. I am trying to figure out what I am doing wrong, as the example you have posted (configured on both switches) leads to my Port Channel 1 reporting – Port-channel 1 is up, line protocol is down(minimum links not up).

There are 4 desktops, ESX1 – ESX4 respectively, running v6 with vDS. All ESX hosts have an SFP+ connection to both S1 and S2 via twinax. Since the ESX host is connected to both switches, and both switches are a VLT pair, what am I missing?

Cheers,

Luis August 13, 2016 at 7:56 am | Permalink

Hello, awesome blog. We are having the same problem as Rob. We have VLT on 2 Dell S4048-ON switches, but when we create a port-channel with LACP to connect an ESXi host, the port-channel does not work and says protocol down (minimum links not up). Do you know what can be wrong?

Regards.

vcdxnz001 August 13, 2016 at 9:28 am | Permalink

Cable plugged into the wrong ports or admin shut down. What does LLDP show?

rob August 13, 2016 at 9:37 am | Permalink

I had figured this out shortly after posting. Considering the initial lack of response, I didn’t bother updating my comment.

Long story short, the virtual distributed switch needs to be completely set up before you can take advantage of the port channel – or at least for it to be seen in VMware.

Obviously I was continually doing something wrong that broke the management connection. This was partly due to my misconfiguration of tagged vs untagged, which was different from what I was used to. So temporarily, I used a single Gb connection for all management. When I was confident that everything was good, I created the 10Gb LAG and migrated all port groups to use it.

My Dell servers have a 4 port Gb NIC and 2 x 10Gb QLogic adapters. Everything is working as expected now.

vcdxnz001 August 13, 2016 at 9:40 am | Permalink

Great news. Setting up vDS and validating the environment before starting is always a good idea. LACP is a lot more complex to set up and manage so is best used only when necessary. Glad you got it sorted.

Luis August 17, 2016 at 12:00 am | Permalink

vcdxnz001 and rob, how are you?

We asked a panic question the other day, but we also solved the problem, and it was also down to VMware and the distributed switch configuration. Once that was properly configured, the LACP port-channel came up correctly on the switches. We now have the LACP port-channel and VLAN tagging working properly.

Regards
