The Great vSwitch Debate – Part 4

OK, we’re now up to Part 4 in this series of articles. With a title like “The Great vSwitch Debate” I bet you’re wondering when the debate’s going to start – well, not yet. I’ve still got a few more details to cover about what makes a vSwitch tick before I can really get into the discussion of what’s the best way to configure your vSwitches.

So far, we’ve been through three posts on vSwitches. If you’ve not read these posts, I recommend that you go back and do so now (or you can read this post and then go back – there are not many dependencies). The first three posts were:

So, what does that leave for Part 4? Plenty! In this edition, we’re going to talk about how a vSwitch detects path failures and also dip our toes into the Cisco Discovery Protocol waters. Now, on to the next topic!

Network Path Failure Detection

OK, so we’ve moved the responsibility for fault tolerance from the virtual machine’s guest OS and placed it squarely on the vSwitch. This begs the question – how does a vSwitch know that a failure has occurred, and what does it do about it?

Well, let’s see if we can answer those questions!

ESX supports two network failure detection mechanisms: beaconing and link state detection. Let’s investigate beaconing first.

Beacon Probing

Beacon Probing, frequently called simply “beaconing”, is intended for use in situations where there are multiple pSwitches between the various pNICs in a vSwitch. It is a technique that sends Layer Two Ethernet broadcast packets from every pNIC in the team to every VLAN to which the vSwitch belongs (yes, that means that if your vSwitch participates in 10 VLANs, beaconing will transmit 10 broadcast packets per pNIC per beacon interval!). You do have the option of overriding the vSwitch default settings at the Port Group level, if you desire.
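To put a number on that broadcast load, here is a minimal Python sketch of the arithmetic. The function name and the sample team size are my own illustrations, and it assumes one beacon per pNIC per VLAN per interval, as described above.

```python
def beacons_per_interval(pnic_count: int, vlan_count: int) -> int:
    """Broadcast beacon frames one vSwitch sends in a single beacon interval."""
    if pnic_count < 0 or vlan_count < 0:
        raise ValueError("counts must be non-negative")
    # Every pNIC in the team beacons on every VLAN the vSwitch belongs to.
    return pnic_count * vlan_count


# Hypothetical team: 4 pNICs on a vSwitch that participates in 10 VLANs.
print(beacons_per_interval(pnic_count=4, vlan_count=10))  # 40
```

Nothing fancy, but it makes the point that the broadcast count grows with both the team size and the number of VLANs.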

See Figure 1 for a depiction of a single pNIC transmitting beacon packets to the broadcast address (in normal operation, every pNIC will transmit beacon packets on every connected VLAN). This packet would be received by all other pNICs in the broadcast domain. An interesting note is that this is the only situation when, during normal operation, you will see the MAC address of the pNIC on the network. The MAC address of the Beacon Initiator pNIC is used as the source MAC address in the beacon frame. It is also of interest that the vSwitch will absorb the beacon frames received by the Beacon Receivers, so the virtual machines will never see the beacon frames.

Figure 1. Beacon Probing

If a Beacon Receiver misses three consecutive beacon packets, it will be flagged as “bad” and put into a down state. Outbound vSwitch traffic will automatically be routed over the surviving interfaces. There are two basic behaviors that the vSwitch will exhibit upon beacon failure (from http://blogs.vmware.com/networking/2008/12/using-beaconing-to-detect-link-failures-or-beaconing-demystified.html):

ESX behavior when a beaconing failure is detected is as follows:

  1. If two or more uplinks receive beacons from each other, those uplinks are considered good. We stop using uplinks which do not receive any beacon packets.
  2. On ESX 3.5, if no uplink receives beacon packets, traffic is sent to all uplinks (shotgun mode). If a team has two uplinks, any link failure will result in all packets being sent to both uplinks.

As you see, you can wind up in a situation where you are transmitting all packets along every uplink path. This can cause extreme confusion for your pSwitches, especially if you have multiple uplinks connected to the same pSwitch (not recommended when using beaconing)!
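The rules above can be sketched as a small Python model. This is only an illustration of the behavior as described in this post, not VMware's implementation; the `Uplink` class, `transmit_set` function, and `MISS_LIMIT` constant are names I made up for the example.

```python
from dataclasses import dataclass

MISS_LIMIT = 3  # consecutive missed beacons before an uplink is flagged "bad"


@dataclass
class Uplink:
    name: str
    missed: int = 0  # consecutive beacon intervals with nothing received

    def record_interval(self, beacon_received: bool) -> None:
        # Any received beacon resets the counter; a miss increments it.
        self.missed = 0 if beacon_received else self.missed + 1

    @property
    def good(self) -> bool:
        return self.missed < MISS_LIMIT


def transmit_set(team):
    """Return (names of uplinks to transmit on, shotgun_mode flag)."""
    good = [u.name for u in team if u.good]
    if good:
        return good, False
    # No uplink is hearing beacons: fall back to sending on all of them.
    return [u.name for u in team], True


# Two-uplink team: when the path between them breaks, neither hears the other.
team = [Uplink("vmnic0"), Uplink("vmnic1")]
for _ in range(MISS_LIMIT):
    for uplink in team:
        uplink.record_interval(beacon_received=False)

print(transmit_set(team))  # (['vmnic0', 'vmnic1'], True)
```

The two-uplink case shows why shotgun mode is so easy to hit: each uplink's only beacon source is the other one, so a single failure silences both.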

There are other issues with Beacon Probing, too. For example, http://kb.vmware.com/kb/1004373 reads as follows:

“When configuring networking for an ESX Server host using at least two vmnic as network adapters and VLAN Type 4095, duplicated packets can occur when Beacon Probing is selected in the Network Failover Detection dropdown menu.”

With the following recommended solution:

“Select the Link Status Only option in the Network Failover Detection dropdown menu instead of Beacon Probing.”

Basically, Beacon Probing should not be used as an alternative to a robust Layer Two network implementation. Instead, use Link Status Only as your network failure detection mechanism.

Link Status Only

Link Status Only error detection relies upon the pNIC’s link state detection capabilities to identify when there is a problem with the network path. At first blush, you may think this is not a very robust error detection scheme – it only sees the condition of the connection between the pNIC and the first upstream pSwitch! It also does nothing to verify that the pSwitch port is configured correctly, and it can’t see deeper into the network. While all this is true, there are some things that can be done to help alleviate these problems.

First, use a pSwitch that has the “Link State Tracking” feature. This feature will mirror the link status of the upstream link (pSwitch to pSwitch) down to the downstream link (pSwitch to ESX pNIC). What this means is that if the first upstream pSwitch becomes isolated, the link status indicator for the pNIC will be set to “Down”, indicating to the vSwitch that the associated path is no longer viable.

Figure 2. Link State Tracking

Figure 2 shows an example of how Link State Tracking works. When looking at the Downstream Link (from the first pSwitch’s perspective), everything looks good; however, the Upstream Link between the two pSwitches has failed. With Link State Tracking enabled, the pSwitch will reflect the state of the Upstream Link on the Downstream Link, letting ESX know that the path is dead and that the alternate path should be used.
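As a rough illustration of the idea, here is a small Python model of a pSwitch that mirrors upstream state onto its downstream ports. The `PSwitch` class and the port names are hypothetical, and the rule that a downstream port goes down only once every tracked upstream link has failed is my assumption about what “isolated” means here; check your switch vendor’s documentation for the real behavior.

```python
from dataclasses import dataclass, field


@dataclass
class PSwitch:
    tracking_enabled: bool
    upstream_up: dict = field(default_factory=dict)    # upstream link -> is it up?
    downstream_up: dict = field(default_factory=dict)  # physical state of ESX-facing ports

    def reported_state(self, port: str) -> bool:
        """Link state the ESX pNIC on `port` would see."""
        physical = self.downstream_up[port]
        if not self.tracking_enabled:
            # Without tracking, the pNIC only sees its own cable and port.
            return physical
        # With tracking, losing every upstream link drags the downstream port down.
        return physical and any(self.upstream_up.values())


failed_uplink = {"uplink1": False}
esx_port = {"port1": True}

print(PSwitch(False, failed_uplink, esx_port).reported_state("port1"))  # True
print(PSwitch(True, failed_uplink, esx_port).reported_state("port1"))   # False
```

The first result is the Figure 2 problem (the vSwitch keeps using a dead path); the second is what tracking buys you.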

Which to Use?

So, you’ve got two options for managing network failure conditions. Which should you use? I recommend the following:

  • When you initially configure your vSwitch, use Beacon Probing. This will allow you to test not only the link state, but also ensure that you can talk across all of your configured port groups. Once you’ve validated proper configuration, switch to Link Status Only.
  • When you add a new port group to an existing vSwitch, set the error detection method for the port group to Beacon Probing to verify correct pSwitch configuration. Once you’ve validated proper configuration, switch to Link Status Only.
  • Don’t use Beacon Probing if more than one pNIC in the vSwitch is connected to the same pSwitch. This could result in the same MAC address being presented on two or more ports on the pSwitch which is “a very bad thing”.
  • Use Link Status Only for network failure detection. If at all possible, use pSwitches that support Link State Tracking to reflect upstream network status back to the vSwitch.
  • Implement a robust Layer Two network. If possible, have your first level pSwitch multi-homed to eliminate single points of failure.
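If you keep an inventory of which pNIC plugs into which pSwitch, the “same pSwitch” rule is easy to check before you turn on Beacon Probing. The sketch below assumes a simple dictionary as that inventory; the function name and the sample names are hypothetical.

```python
from collections import Counter


def pswitches_with_multiple_pnics(pnic_to_pswitch):
    """Return the pSwitches that have more than one pNIC of the team attached."""
    counts = Counter(pnic_to_pswitch.values())
    return sorted(name for name, n in counts.items() if n > 1)


# Hypothetical team: two pNICs share pswitch-a, so Beacon Probing is out.
team = {"vmnic0": "pswitch-a", "vmnic1": "pswitch-a", "vmnic2": "pswitch-b"}
shared = pswitches_with_multiple_pnics(team)
mode = "Link Status Only" if shared else "Beacon Probing is an option for validation"
print(shared, "->", mode)  # ['pswitch-a'] -> Link Status Only
```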

Cisco Discovery Protocol (CDP)

The Cisco Discovery Protocol (CDP) is used to obtain pSwitch port configuration from an ESX host. The information returned by CDP can be invaluable when you’re trying to verify or modify your network configuration. Examples of some of the information returned by CDP include:

  • Identification of the pSwitch to which a pNIC is connected
  • Identification of the pSwitch port to which the pNIC is connected
  • Speed & Duplex settings for the pSwitch port
  • VLAN number(s) associated with the pSwitch port

The CDP information is available either via the command line (use vmware-vim-cmd) or via vCenter Server (on the Configuration / Networking tab). For more detailed information on the use of CDP, check out the following VMware knowledgebase articles:

http://kb.vmware.com/kb/1007069

http://kb.vmware.com/kb/1003885
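Once you have the CDP details in hand, verifying them against what you expected is a simple comparison. The sketch below assumes you have already copied the values into a dictionary by hand; the field names and sample values are made up for illustration and are not the output format of any VMware tool.

```python
def cdp_mismatches(expected, reported):
    """Map each differing field to (expected value, reported value)."""
    return {
        key: (want, reported.get(key))
        for key, want in expected.items()
        if reported.get(key) != want
    }


# Hypothetical values for one pNIC; only the VLAN differs from the plan.
expected = {"switch": "pswitch-a", "port": "Gi0/1", "speed_mbps": 1000,
            "full_duplex": True, "vlan": 10}
reported = dict(expected, vlan=20)

print(cdp_mismatches(expected, reported))  # {'vlan': (10, 20)}
```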

OK, this wraps it up for Part 4 of my series on vSwitches. In the next section (Part 5), I’ll start getting into some of my recommended configurations. That’s where the real fun begins – and hopefully we’ll get the “Debate” part of this thing spun up 🙂

References:

Beaconing Demystified: Using Beaconing to Detect Link Failures

VMware Virtual Networking Concepts

Duplicated Packets Occur when Beacon Probing Is Selected Using vmnic and VLAN Type 4095

Cisco Discovery Protocol (CDP) network information via command line and VirtualCenter on an ESX host

Configuring the Cisco Discovery Protocol (CDP) with ESX Server

About Ken Cline

vExpert 2009

16 responses to “The Great vSwitch Debate – Part 4”

  1. Rob D. says :

    Ken,

    Please post part 5, I’m on the edge of my chair! Your vSwitch article is excellent, and explains a lot of information in a small space that is very useful to me (and I’m sure many others). Keep up the good work!

    -Rob

  2. Duncan says :

    Also, don’t use beacon probing when you only have 2 nics… there’s no way of detecting which nic failed with only two.

    Great article again Ken!

  3. Daern says :

    Good article, thanks.

    One thing to note is that, while link state tracking does work for some scenarios, if you have a glitch (as we have done) where a host pNIC simply stops passing traffic, any VMs on this pNIC will die and remain dead until you manually intervene.

    Likewise, if your helpful switch admin makes an error in the VLAN config on the switch and (for example) deletes a couple of VLANs from a single port, again this will result in an extended outage for your VMs.

    Beaconing will catch both of these issues, although as I’m finding now, *knowing* that this has happened is something of a challenge…

    • Ken Cline says :

      Hi Daern,

      Thanks for the comment. I agree that there are situations where link state alone will not catch a problem – I still prefer not to use beaconing on an ongoing basis. It just goes against my “keep it simple” mantra. It’s a great troubleshooting tool and should be used accordingly, but if you’re having success with your config – go with it 😉

      Thanks again!
      KLC

    • akopel says :

      We are also considering beaconing for the same reasons Daern mentions. Specifically, we have had various flaky nic driver issues where link state is ‘up’ but the nic fails to pass traffic.

      I would be curious to hear from Daern now, over a year later, to see where you sit on the issue (are you still using beaconing?).

      Also, did you ever find an ‘elegant’ way to ‘detect’ when a beacon failure occurs, especially in a 2 pNIC situation?

      akopel

  4. Sebas says :

    Hi Ken,

    Great article !
    We have had some occasions when “link-state” wasn’t enough, so I’m looking into beaconing now.
    I don’t understand the “VLAN type 4095” part from the article:

    “Duplicated Packets Occur when Beacon Probing Is Selected Using vmnic and VLAN Type 4095”: http://kb.vmware.com/kb/1004373

    Will this only occur on VLAN 4095?
    Because if not, I’m considering enabling beaconing but setting the interval to a much larger value, so we won’t be flooding the physical switches.

    Keep up the good work !

    Sebas

    • Ken Cline says :

      Yes, it is only with VLAN 4095, which allows the associated port group to see all packets.

      You should write a blog post about your experience with “link-state” not meeting your needs – especially if you were using link state tracking!

      Thanks,
      KLC

  5. NiTRo says :

    Ken, I found that the “link status only” method can be tuned to use the speed, duplex, and errors of the vmnics to detect failure, with the command vim-cmd hostsvc/net/vswitch_setpolicy (and portgroup_set).
