The Great vSwitch Debate – Part 6

OK, so the count is up to five posts on vSwitches. If you’ve not read Parts 1 through 5 of this series, I recommend that you go back and do so now.

Now, in Part 6, we finally start talking about host configurations! I started a thread over on the VMTN Community forums for people to provide input about content they would like to see in this series. VMTN user RobVM asked about a configuration with eight pNICs and iSCSI connectivity, so I’ll tackle that first. But before we do, let me lay some ground rules:

  • All networks will have at least two pNICs to provide fault tolerance. While ESX will work just fine without the fault tolerance features, I don’t consider that to be an option for a real live “production” environment. If you’re building out a personal “playground” or a lab environment, feel free to cut the number of pNICs in half, but don’t come crying to me if your one and only network connection fails and your ESX/i host and/or your VMs fall off the face of the network 🙂
  • Unless otherwise specified, all configuration options are set to their default values. This is in line with my “Don’t change a default value unless you have a good reason” philosophy (and it makes it easier to describe!).
  • There is a requirement to support two separate “application zones”. These are logically separate networks that could be separated via VLANs or physical separation.

Now, on with the show!

Configuration with Eight pNICs and IP Storage

Eight is a really good number of pNICs if you plan to use IP-based storage. If you remember from Part 5 of this series, there are a variety of networks that we need to support on ESX:

  • VMware Management Network. This is the network that connects vCenter Server to the ESX Service Console. Since ESXi doesn’t have a Service Console, the ESXi Management Network is terminated at the vmkernel.
  • VMotion Network. This network interconnects the various ESX/i (reminder, ESX/i is my shorthand notation for ESX and/or ESXi) hosts within a VMware cluster and enables VMotion among those nodes.
  • IP Storage (NFS or iSCSI) Network. The network that provides storage for virtual machines and ancillary support files (e.g., .iso images).
  • Virtual Machine Network(s). One or more networks that allow VMs to access and provide services.

Notice that there are four networks – to get the redundancy that I said we were going to implement, multiply four by two and you get eight! (Can you handle this upper level math? It sometimes throws me for a loop!)

So, now we have enough information to throw out a configuration (Figure 1):

Figure 1. Basic Eight pNIC Configuration

As you can see, I’ve laid out four vSwitches, each with two pNIC uplinks configured. The load balancing algorithm is “Route based on virtual switch port ID”. While this is a perfectly sane, robust, and common configuration, there are some negatives associated with this config:

vSwitch0 and vSwitch1 each have two pNICs associated with them; however, there is a single port group and, more importantly, a single service associated with each. This means that each of these two vSwitches will have a pNIC sitting idle, waiting for the active path to fail. In the grand scheme of things, this isn’t such a horrible thing. We’re talking about, what, maybe a thousand dollars worth of “stuff” sitting idle? Between the two pNICs, the associated cabling, and the physical switch ports, that may be a little low, but overall, not a big deal.
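For those who would rather script this layout than click through the VI client, here’s a minimal sketch of one of these two-uplink vSwitches using the pyVmomi Python SDK (which, to be clear, postdates this post). The host name, credentials, and vmnic numbering are illustrative assumptions; everything else is left at its defaults.

```python
import ssl
from pyVim.connect import SmartConnect
from pyVmomi import vim

# Placeholder connection details; adjust for your environment.
si = SmartConnect(host="vcenter.example.com", user="administrator",
                  pwd="secret", sslContext=ssl._create_unverified_context())
host = si.RetrieveContent().searchIndex.FindByDnsName(
    dnsName="esx01.example.com", vmSearch=False)
net_sys = host.configManager.networkSystem

# vSwitch0: two pNIC uplinks provide the fault tolerance called for above.
spec = vim.host.VirtualSwitch.Specification(
    numPorts=64,
    bridge=vim.host.VirtualSwitch.BondBridge(nicDevice=["vmnic0", "vmnic1"]))
net_sys.AddVirtualSwitch(vswitchName="vSwitch0", spec=spec)

# vSwitch1 through vSwitch3 follow the same pattern with vmnic2..vmnic7.
```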

vSwitch2 is dedicated to IP Storage. In many cases, this will be either iSCSI or NFS; however, there is nothing to prevent you from running both protocols across this single vSwitch. In fact, many places will do just that – using iSCSI for storing VMFS volumes and NFS for hosting .iso and other “support files”. Notice that if you’re using iSCSI storage, you do need to provide the service console with visibility into that network for authentication purposes.
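For completeness, here’s a hedged pyVmomi sketch of the IP storage pieces on this vSwitch. The PG_IPStor label, the IP address, and the subnet mask are assumptions for illustration; on classic ESX, the service console port mentioned above would be added alongside it.

```python
import ssl
from pyVim.connect import SmartConnect
from pyVmomi import vim

si = SmartConnect(host="vcenter.example.com", user="administrator",
                  pwd="secret", sslContext=ssl._create_unverified_context())
host = si.RetrieveContent().searchIndex.FindByDnsName(
    dnsName="esx01.example.com", vmSearch=False)
net_sys = host.configManager.networkSystem

# Port group for vmkernel IP storage traffic (iSCSI and/or NFS) on vSwitch2.
pg_spec = vim.host.PortGroup.Specification(
    name="PG_IPStor", vlanId=0, vswitchName="vSwitch2",
    policy=vim.host.NetworkPolicy())
net_sys.AddPortGroup(portgrp=pg_spec)

# vmkernel NIC with a static address on the storage network.
vnic_spec = vim.host.VirtualNic.Specification(
    ip=vim.host.IpConfig(dhcp=False, ipAddress="192.168.50.11",
                         subnetMask="255.255.255.0"))
net_sys.AddVirtualNic(portgroup="PG_IPStor", nic=vnic_spec)

# On classic ESX, the software iSCSI initiator also needs a Service Console
# port on this network (AddServiceConsoleVirtualNic) for authentication.
```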

vSwitch3 is your virtual machine vSwitch. Since we have two application environments, we have defined two different VLANs (by simply adding the VLAN number into the Port Group configuration and ensuring that the pSwitch is configured to trunk the required VLANs). If your policies dictate that these two application environments must not be allowed to comingle on the wire, you would need to add two additional pNICs (to maintain redundancy), and to ensure no comingling, I would recommend a separate vSwitch rather than port groups with active/standby/unused pNIC configurations.
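Scripted, the two application port groups could look something like the sketch below (again pyVmomi). VLAN IDs 101 and 102 are made-up examples; whatever you use, the pSwitch ports must be configured to trunk those VLANs.

```python
import ssl
from pyVim.connect import SmartConnect
from pyVmomi import vim

si = SmartConnect(host="vcenter.example.com", user="administrator",
                  pwd="secret", sslContext=ssl._create_unverified_context())
host = si.RetrieveContent().searchIndex.FindByDnsName(
    dnsName="esx01.example.com", vmSearch=False)
net_sys = host.configManager.networkSystem

# Two VLAN-tagged port groups on the virtual machine vSwitch (vSwitch3).
for pg_name, vlan in (("PG_App1", 101), ("PG_App2", 102)):
    net_sys.AddPortGroup(portgrp=vim.host.PortGroup.Specification(
        name=pg_name, vlanId=vlan, vswitchName="vSwitch3",
        policy=vim.host.NetworkPolicy()))
```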

Now, let’s spice it up a little and add in the requirement to support Multipath I/O (MPIO) from within our virtual machines. In order to support MPIO from within a VMware VM, you need to provision the VM with multiple vNICs. In my example, I’m going to configure a VM with three vNICs – one for “regular” network traffic and two to support MPIO access to an iSCSI target. In addition to these direct network connections, the VM will access its system volume as a virtual disk via the vmkernel iSCSI stack (see Figure 2).

Figure 2. Guest OS MPIO iSCSI Configuration

So, we now have a single VM that is connecting to four separate port groups (although the connection to PG_IPStor is totally transparent – abstracted through the guest OS vSCSI controller). What really makes this VM different is the two extra vNICs that are connected to the PG_VMStor1 & PG_VMStor2 port groups. We will be loading an iSCSI Initiator inside the guest operating system to access an iSCSI target. I won’t go into the details of how to configure the iSCSI initiator to support MPIO, but just trust me when I tell you that you can get better load balancing and better overall iSCSI performance using this approach rather than the native ESX iSCSI access. This improved performance comes with a significant price, though. By using this approach, you now have to manage the iSCSI Initiator within the guest OS and you have to manage the allocation of iSCSI targets to initiators. Additionally, the iSCSI Initiator inside the VM will drive your host’s CPU utilization significantly higher. This (higher CPU utilization) is frequently not much of a problem because many environments are not CPU constrained to start with. The additional management overhead is a big concern (to me).
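To make the three-vNIC layout concrete, here’s roughly how the two storage-facing vNICs could be added to such a VM with pyVmomi. The VM name and the choice of vmxnet3 adapters are assumptions; configuring the iSCSI initiator and MPIO inside the guest OS is a separate exercise that isn’t shown.

```python
import ssl
from pyVim.connect import SmartConnect
from pyVmomi import vim

si = SmartConnect(host="vcenter.example.com", user="administrator",
                  pwd="secret", sslContext=ssl._create_unverified_context())
vm = si.RetrieveContent().searchIndex.FindByDnsName(
    dnsName="appvm01.example.com", vmSearch=True)

# One vNIC per storage port group, so the guest sees two independent paths.
nic_changes = []
for pg_name in ("PG_VMStor1", "PG_VMStor2"):
    nic = vim.vm.device.VirtualVmxnet3()
    nic.backing = vim.vm.device.VirtualEthernetCard.NetworkBackingInfo(
        deviceName=pg_name)
    nic.connectable = vim.vm.device.VirtualDevice.ConnectInfo(
        startConnected=True)
    nic_changes.append(vim.vm.device.VirtualDeviceSpec(
        operation=vim.vm.device.VirtualDeviceSpec.Operation.add, device=nic))

# Reconfigure the VM; the "regular" vNIC and the vSCSI system disk are untouched.
vm.ReconfigVM_Task(spec=vim.vm.ConfigSpec(deviceChange=nic_changes))
```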

<soapbox>

I strongly encourage you to use this solution ONLY when you find that the standard mechanism (connection via the vSCSI interface) for accessing iSCSI storage does not provide the levels of performance that you need. Remember, just because something is “faster” doesn’t make it “better”! In many cases, I’ve seen people go through the pain of implementing this solution only to find that storage throughput was not their bottleneck, and a fatter pipe didn’t help overall performance at all. In many (most?) cases, the performance bottleneck lives in the application and not in the supporting infrastructure. Remember that most “modern” applications were designed years ago when the Pentium III and Pentium 4 ruled the roost, when 500MHz RAM was blazingly fast, when 100Mbps networks were the norm, and when U160 SCSI on 10,000 RPM drives provided more than enough “speed”.

 

</soapbox>

Now that I’ve gotten that off my chest…here’s what this solution would look like (Figure 3):

Figure 3. Eight pNIC Configuration with Guest MPIO

Wait! What’s that I see on vSwitch1? Could it be? Is it possible? Did Ken really configure active and passive pNICs? Yep, I did! Why? Well, it’s really fairly simple. The default load balancing algorithm is vSwitch port ID based. If you remember from Part 3 of this series, there are no guarantees about which pNIC your vNIC will associate with. In this case, we need to make sure that the vNICs wind up on different pNICs. At first blush, it would seem that I could default everything and still be OK, but what happens if I connect and disconnect a couple of vNICs during operations? Remember that vSwitch ports are statically mapped to pNICs, so if I were to power up three VMs connecting into vSwitch1, the first and third each with two vNICs and the second with a single vNIC, I would have the following (Figure 4):

Figure 4. Virtual Machine Initial vNIC Mapping

Notice that vSwitch Port #3 is statically mapped to pNIC2. Now, if I power off the second VM (the one with only one vNIC) and power on another VM with two vNICs, I wind up with the configuration shown in Figure 5.

Figure 5. Virtual Machine Secondary vNIC Mapping

Notice now that BOTH vNICs for the newly powered on VM are mapped to pNIC1 – NOT what we wanted! If you could guarantee that every VM connected to this vSwitch would have two vNICs and that, at no point, would any of the vNICs be administratively disabled, you could allow the configuration to default. Personally, that’s too many “ifs” for me to trust!

Wow! Not only did I find a justification to break my “always default” rule, I found a good example of why you would want to use active and standby adapters!
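For anyone who wants to see that override in code form, here’s a hedged pyVmomi sketch of pinning each storage port group on vSwitch1 to a different active pNIC with the other as standby. The port group names match the figures, but the vmnic assignments are assumptions.

```python
import ssl
from pyVim.connect import SmartConnect
from pyVmomi import vim

si = SmartConnect(host="vcenter.example.com", user="administrator",
                  pwd="secret", sslContext=ssl._create_unverified_context())
host = si.RetrieveContent().searchIndex.FindByDnsName(
    dnsName="esx01.example.com", vmSearch=False)
net_sys = host.configManager.networkSystem

def pin_uplinks(pg_name, active, standby):
    """Override the vSwitch teaming policy at the port group level."""
    policy = vim.host.NetworkPolicy(
        nicTeaming=vim.host.NetworkPolicy.NicTeamingPolicy(
            policy="loadbalance_srcid",  # keep the default port ID algorithm
            nicOrder=vim.host.NetworkPolicy.NicOrderPolicy(
                activeNic=[active], standbyNic=[standby])))
    spec = vim.host.PortGroup.Specification(
        name=pg_name, vlanId=0, vswitchName="vSwitch1", policy=policy)
    net_sys.UpdatePortGroup(pgName=pg_name, portgrp=spec)

# Opposite active/standby order guarantees the two vNICs use different pNICs.
pin_uplinks("PG_VMStor1", active="vmnic2", standby="vmnic3")
pin_uplinks("PG_VMStor2", active="vmnic3", standby="vmnic2")
```

Because the override lives at the port group level, the rest of vSwitch1 keeps its defaults, so this is about as small a departure from the “always default” rule as you can get.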

Configuration with Eight pNICs without IP Storage

This one is easy. If you take the first example, keep the requirements the same, but remove the need for IP storage while still using eight pNICs, I would do the following (Figure 6):

Figure 6. Eight pNICs with no IP Storage

All we’ve done is split out the two application networks onto separate vSwitches, providing additional bandwidth (probably not needed) and additional fault tolerance. By doing this, we’ve eliminated the need to configure VLAN trunking on the pSwitch and the need to specify a VLAN number on the PG_App1 and PG_App2 port groups.

References

Microsoft Multipath I/O: Frequently Asked Questions: http://www.microsoft.com/WindowsServer2003/technologies/storage/mpio/faq.mspx

Microsoft iSCSI Users Guide: http://download.microsoft.com/download/A/E/9/AE91DEA1-66D9-417C-ADE4-92D824B871AF/uGuide.doc


About Ken Cline

vExpert 2009

20 responses to “The Great vSwitch Debate – Part 6”

  1. Edward L. Haletky says :

    Very good; however, the iSCSI server to which the VM is connected should NOT be the iSCSI server to which the ESX hosts are connected. If it is, the VMs become an attack point and could be used to intercept iSCSI traffic to and from the ESX hosts. In addition, since iSCSI to ESX requires a service console connection, your VMs could now become part of the service console network and create one more attack point.

    • Ken Cline says :

      Valid points, Edward. Thanks for mentioning – security is critical to the design of a successful virtual infrastructure.

      KLC

  2. Cameron Moore says :

    One clarification: the graphics show the PG_IPStor1 port group twice, but they should read as PG_IPStor1 and PG_IPStor2.

    This is a great series of posts, Ken.

  3. Rob D. says :

    Ken,

    Thanks for directly answering my question; that was very generous of you. I came up with a similar diagram with vMotion and CSO on vSwitch0 (why do your diagrams always look better than mine?). Perhaps the next version of ESX will support MPIO; who knows, maybe it’ll be part of the vSphere 4 announcement this morning. I do appreciate your soapbox comment about only using this if you really need it, as it does add yet another layer of complexity. I have set up MPIO in physical Win 2003 and Win 2008 along with off-host backups using hardware VSS providers, and when you are done you end up with a somewhat complicated setup. I would not want to use it more than absolutely necessary in our VMs.

    Regards,

    -Rob D. (AKA RobVM)

  4. Jason L says :

    Ken,

    Thanks for the great posts. Good and easy reading. I have a question regarding Figure 3. You show there using 1 vSwitch and running both VMotion and the SC on it. I am assuming you would be using VLAN trunking to isolate the networks for each. My question is: when would you want to create individual vSwitches as opposed to hosting them on the same vSwitch?

    -Jason

    • Ken Cline says :

      Hi Jason,

      Yes, in Fig 3 there are two port groups and I would recommend the use of different VLANs to logically separate the traffic. Notice that in Fig 1 I have the SC & VMotion on separate vSwitches. In a “perfect world” every network would have at least two pNICs on its own vSwitch. This provides the greatest degree of isolation and also gives fault tolerance. In Fig 3, I combined them onto a single vSwitch (on different VLANs) to provide more pNICs for use by the intra-VM iSCSI initiator.

      So, my recommendation is:

      – If you have enough pNICs to provide maximum isolation and still have the fault tolerance you need, create separate vSwitches.
      – If you don’t have enough pNICs, then combine networks based on the performance/sensitivity matrix approach I discussed in Part 5 (https://kensvirtualreality.wordpress.com/2009/04/17/the-great-vswitch-debate-part-5/)

      You could throw all the pNICs into a single vSwitch and use port group settings to set active/standby/unused adapters, but I don’t like the complexity of that approach. I’m a simple guy and like simple setups – multiple vSwitches make me happy 🙂

      Hope this helps,
      KLC

  5. Daern says :

    Ken,

    Many thanks for this set of articles, which I’ve just spent an hour or so reading. Very informative indeed, and it provides an excellent insight into the numerous configuration options for managing networking in VMware.

    One observation: In these days of reducing IT cost and complexity, having an eight pNIC server for a non-IP storage solution seems, well, a little extravagant, especially when only half of the ports are devoted to serving “front facing” traffic. Combining management (low traffic, high availability) with VMotion (high traffic, low availability) on separate port groups seems a very logical solution, reducing the number of NICs and, almost more importantly, switch ports required by the solution with very few drawbacks.

    Excellent stuff!

    Daern

  6. Massimo Re Ferre' says :

    Ken,

    100% agreed with the management vs. performance statement. To make things worse from a management perspective, it must be noted that if you use the guest-based iSCSI functionality you do need a “local” boot/disk (c:\) coming from your non-iSCSI storage, and then you can map an additional disk (d:\) to which you can apply your advanced MPIO algorithms. Very big mess in my opinion.

    Massimo.

  7. Ian says :

    Hi Ken,

    Excellent articles. I’ve learnt a lot! Thank you.

    With regards to IP storage, how do you set up the active and passive pNICs outlined in your example? I’m hoping to do this with my setup.

  8. Dan says :

    Nice work Ken.

    I’m coming to the debate a bit late, but I have a very similar setup to your examples: 8 pNICs with IP storage (iSCSI). The issue I have is that we have thrown FT into the mix, which has similar recommendations to VMotion. The solution we are thinking of using is a vSwitch with 3 pNICs and 3 PGs, then use active/standby adapters for the service console, VMotion, and FT.

    The VM vSwitch would have 3 pNICs as well (default config), and the iSCSI vSwitch would have 2 pNICs.

    Any comments would be appreciated.

    Regards
    Dan

    • Ken Cline says :

      Sounds like a good plan. You’ve isolated traffic pretty well and also provided a level of fault tolerance with the standby pNICs. For your iSCSI config, have you read the “Multi-Vendor iSCSI” whitepaper (here)? Best information available for IP storage!

      Good luck,
      KLC

  9. Daniel Myers (@Dmyers_) says :

    Hi. Wow, great set of articles. A couple of questions:
    After Figure 4, it says “Notice that vSwitch Port #3 is statically mapped to pNIC2.” However, the diagram maps it to pNIC1. Am I misreading this? (It is late here.)

    Now in Fig 5, where we newly pick up port 6, why does it map to pNIC1 when using the 2nd PG, which has active set to pNIC2?

    Thanks, I’m keen to understand all this forwards and backwards! 🙂

  10. Todd D says :

    Ken,

    Wondering if you plan to update the “great debate” any time soon with a perspective on 10Gb and vDS? With many folks considering a converged network strategy and reducing cabling complexity, there is a need to revisit the virtual switch design mantra of multiple vSwitches with redundant 1Gb NICs. I’m particularly interested in the performance impact of reducing the vSwitch count on ESX/i, along with reported race conditions when combining the service console into the same vSwitch.

    Regards,
    Todd
