- One Role Per PortGroup – Using each port group for a single role is the ideal scenario when designing networks for a vSphere environment. In essence, you should have one VMkernel port group each for the Management Network, the Secondary Management Network (if available), FT, vMotion & vSphere Replication (this option is not yet active in vSphere, although it can be seen in the vSphere Web Client). This ensures that you assign an individual VMK ID to each function, which not only keeps your network settings simple but also makes them easy to troubleshoot. For Virtual Machine port groups, use separate port groups for easy identification and VLAN tagging. That said, with the VLAN trunking option available on a dvSwitch you can use a single port group for multiple segments (rarely used).
- Use of VLANs – VLANs are not uncommon in a vSphere infrastructure. They are the easiest way to run multiple network segments over a single physical wire. I have seen very few environments that do not use VLANs, and in most of those cases the reason is a lack of networking knowledge. I have seen servers with up to 22 NIC ports and the same number of network cables coming out of the server, which of course is INSANE. The simple reason behind this crazy setup was that VLANs were not used. Last but not least, don't forget to trunk the VLANs on the physical switch ports and define the same VLAN IDs on the virtual port groups. Use of VLAN 1 (the native VLAN) is another crime which should never be committed 🙂
- Standardize VMkernel interfaces across ESXi hosts – This is the simplest yet most ignored best practice of all. Each VMK defined on a host should be identical to the VMK defined on the other hosts in the same ESXi cluster. For example, if vmk0 is Management on one host, it should be Management on all the other hosts in the cluster. The same applies to all the VMkernel interfaces. This ensures that your network configuration is easily readable. Another benefit shows up when you are writing scripts to manage the network settings of your ESXi cluster. Try to follow this practice for the IP addressing of management functions as well. It makes your life easier when you are troubleshooting during unplanned downtime.
- Redundant NICs, a dedicated FT network, appropriate load balancing policies & appropriate NIC failover ordering are a few other configurations which can make or break the networking stack of a vSphere infrastructure.
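The VLAN rules above (trunk every VLAN you tag on a port group, and never use VLAN 1) are easy to sanity-check. Here is a minimal sketch in plain Python, not a VMware API; the port group names, VLAN IDs and trunk list are illustrative assumptions:

```python
# Hypothetical sketch: flag port groups that use native VLAN 1 or tag a VLAN
# that is not trunked on the physical switch port. All values are made up.
TRUNKED_VLANS = {10, 20, 30, 40}  # assumed allowed-VLAN list on the uplink port

PORTGROUP_VLANS = {  # illustrative port group -> VLAN ID mapping
    "Management": 10,
    "vMotion": 20,
    "VM-Production": 30,
}

def vlan_issues(portgroup_vlans, trunked):
    """Return human-readable problems with the VLAN layout."""
    issues = []
    for pg, vlan in portgroup_vlans.items():
        if vlan == 1:
            issues.append(f"{pg}: uses native VLAN 1")
        elif vlan not in trunked:
            issues.append(f"{pg}: VLAN {vlan} is not trunked on the physical port")
    return issues

print(vlan_issues(PORTGROUP_VLANS, TRUNKED_VLANS))  # []
```

In a real environment you would feed this from the dvSwitch configuration and the switch port's allowed-VLAN list instead of hard-coded dictionaries.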
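The VMK standardization rule is also exactly the kind of thing a short script can verify across a cluster. A minimal sketch, again in plain Python with made-up host names and role maps (a real version would pull this data via PowerCLI or pyVmomi):

```python
# Hypothetical sketch: check that every host in a cluster assigns the same
# role to the same vmk ID. Host names and role maps are illustrative.
CLUSTER = {
    "esxi01": {"vmk0": "Management", "vmk1": "vMotion", "vmk2": "FT"},
    "esxi02": {"vmk0": "Management", "vmk1": "vMotion", "vmk2": "FT"},
    "esxi03": {"vmk0": "Management", "vmk1": "FT",      "vmk2": "vMotion"},
}

def vmk_mismatches(cluster):
    """Compare every host against the first (reference) host and report deviations."""
    hosts = iter(sorted(cluster.items()))
    ref_name, ref = next(hosts)
    mismatches = []
    for name, vmks in hosts:
        for vmk, role in sorted(vmks.items()):
            if ref.get(vmk) != role:
                mismatches.append(
                    f"{name}/{vmk} is {role}, but {ref_name}/{vmk} is {ref.get(vmk)}"
                )
    return mismatches

for problem in vmk_mismatches(CLUSTER):
    print(problem)  # esxi03 deviates on vmk1 and vmk2
```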
Read the following knowledge base article from VMware, which is a great source for clearly laying out the differences between Standard & Distributed Virtual Switches – Overview of vNetwork Distributed Switch concepts.
Convergence is the way forward – Move to 10G Networks
That said, I would only encourage you to make this move if you have planned IT budgets for it, as the transformation would otherwise be overkill: in most cases your old hardware would go to waste. With 10G, you have the option of either using convergence with the management platform available from your hardware vendor, or using features like Distributed Virtual Switches and Network I/O Control to manage the bandwidth requirements of your workloads.
Here are a couple of articles which I wrote about using 10G cards for vSphere networking. I would urge you to read them to see how things become simpler with the implementation of converged networks.
- Dividing Bandwidth of a 10 GB CNA Adapter for ESXi Networking and Storage using Network I/O Control (NIOC)
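The core of dividing a 10 Gb CNA with Network I/O Control is simple shares arithmetic: during contention, each network resource pool is entitled to its share count divided by the total shares, multiplied by the uplink bandwidth. A minimal sketch of that math; the pool names and share values below are assumptions for illustration, not vSphere defaults:

```python
# Hypothetical sketch of NIOC shares arithmetic on a saturated 10 Gb uplink.
# Share values are illustrative assumptions, not vSphere defaults.
LINK_GBPS = 10.0

SHARES = {"Management": 20, "vMotion": 50, "FT": 30, "VM Traffic": 100, "NFS": 50}

def bandwidth_during_contention(shares, link_gbps=LINK_GBPS):
    """Bandwidth (Gbps) each pool is entitled to when the link is saturated."""
    total = sum(shares.values())
    return {pool: round(link_gbps * count / total, 2) for pool, count in shares.items()}

print(bandwidth_during_contention(SHARES))
# With 250 total shares, "VM Traffic" is entitled to 100/250 of 10 Gbps = 4.0 Gbps
```

Note that shares only matter under contention; when the link is idle, any pool can burst toward the full 10 Gbps (subject to any limits you configure).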
- Be a miser while provisioning vSwitches, port groups, or for that matter any virtual hardware. Please remember that they might appear to be FREE, but they do use CPU cycles and memory, since they are nothing but pieces of code which run as soon as you create a new object. The idea is not to stop you from provisioning, but to make sure that you do not provision what you do not require. This also helps when you are troubleshooting issues.
- vMotion has been around for a while, but the options to configure it have kept evolving. Use these options to ensure that you have fast vMotion networks, which will allow you to load balance clusters, evacuate hosts & run maintenance tasks way faster than you have been doing. Use Multi-NIC vMotion to do this. There are a number of articles on Multi-NIC vMotion on frankdenneman.nl by Frank Denneman which I refer to. Go use them, they are awesome 🙂
- This is another important tip, organizational and operational in nature. You need to involve your networking team while choosing, designing & implementing your vSphere infrastructure. The more the networking experts are involved, the better your networking stack will support your virtual infrastructure. Also, just as I see convergence in the datacenter, I see convergence in IT teams: networking & storage teams are getting trained on vSphere, and Wintel/VMware teams are getting trained on networking and storage platforms. This change is welcome, and my suggestion to you would be to follow it ASAP so that you, your teams and your organization are able to cope with this paradigm shift.
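The Multi-NIC vMotion tip above follows a well-known pattern: two vMotion VMkernel interfaces, each on its own port group, with the active/standby uplink order inverted between the two so both NICs carry vMotion traffic simultaneously. A minimal sketch of that layout and a check for it, with all names being illustrative assumptions:

```python
# Hypothetical sketch of the Multi-NIC vMotion layout: two vMotion vmk
# interfaces on separate port groups with inverted active/standby uplinks.
# Port group, vmk and vmnic names are made up for illustration.
VMOTION_PORTGROUPS = {
    "vMotion-01": {"vmk": "vmk1", "active": ["vmnic2"], "standby": ["vmnic3"]},
    "vMotion-02": {"vmk": "vmk2", "active": ["vmnic3"], "standby": ["vmnic2"]},
}

def multi_nic_vmotion_ok(portgroups):
    """Each port group needs exactly one active uplink, and no two port groups
    may share the same active uplink, so every NIC carries vMotion traffic."""
    actives = [pg["active"] for pg in portgroups.values()]
    if any(len(a) != 1 for a in actives):
        return False
    flat = [a[0] for a in actives]
    return len(set(flat)) == len(flat)

print(multi_nic_vmotion_ok(VMOTION_PORTGROUPS))  # True
```

For the full design reasoning (and the actual vSphere configuration steps), Frank Denneman's Multi-NIC vMotion articles mentioned above remain the reference.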