Note: This is not a Nutanix best practice and should not be done without first discussing the caveats with upper management.
This is a unique situation I came across in the most recent build of a Nutanix cluster. The project required the Nutanix cluster to support 2 distinct and separate networks. Since the NTNX nodes each have 2x 10Gbps and 2x 1Gbps NICs, I needed to create 2 bonds… each bond containing a single 10Gbps and a single 1Gbps interface.
Herein lies the issue and why this is an unsupported setup. The architecture of NTNX’s clusters leverages a Controller VM (CVM) that resides on each node in the cluster. These CVMs talk to one another and are essentially the entire brains behind the operation and what makes NTNX so great. This is why the best practice is to pair your 10Gbps links in bond0 and pair your 1Gbps links in bond1. If you do a dual-homed setup and happen to lose one of your 10Gbps links, most scenarios will see you saturating that 1Gbps link, and performance will be degraded. As long as all parties understand the risks associated with a dual-homed setup of this nature, let’s get cracking.
Time to create the new bridge… SSH into one of your AHV hosts and run:
ovs-vsctl add-br br1
Bridges are per-host, so repeat this on each AHV host in the cluster.
SSH into the CVM of your choice, or if you’re already SSH’d into your AHV node, just run ‘ssh nutanix@192.168.5.254’ to reach that host’s CVM over the internal interface.
Take a look at your interfaces first:
manage_ovs show_interfaces
name mode link speed
eth0 1000 True 1000
eth1 1000 True 1000
eth2 10000 True 10000
eth3 10000 True 10000
Acropolis sets up all your interfaces in a single bridge and bond by default, so we will create a new bridge and bond and split the interfaces into eth0/eth2 and eth1/eth3. Obviously, you will have to physically run your cables to the appropriate separate networks, so these interfaces might differ on your end.
Modify bridge0 and bond0 to contain only the interfaces of network 1:
manage_ovs --bridge_name br0 --bond_name bond0 --interfaces eth0,eth2 update_uplinks
Create bridge1 and bond1 to contain only the interfaces of network 2:
manage_ovs --bridge_name br1 --bond_name bond1 --interfaces eth1,eth3 update_uplinks
Verify everything looks good:
manage_ovs show_uplinks
Uplink ports: bond1
Uplink ifaces: eth1 eth3
Uplink ports: bond0
Uplink ifaces: eth0 eth2
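If you'd rather script the sanity check than eyeball it, here is a hypothetical sketch that confirms each planned bond pairs one 1 Gbps and one 10 Gbps interface. The speeds hard-coded below mirror the example `manage_ovs show_interfaces` output earlier; substitute your own, and note that `speed_of` and `check_bond` are illustrative helpers, not part of any Nutanix tooling.

```shell
#!/usr/bin/env bash
# Hypothetical sanity check: each dual-homed bond should pair exactly one
# 1 Gbps and one 10 Gbps NIC. Speeds mirror the example table above.
speed_of() {                 # link speed in Mbps, per the interface table
  case $1 in
    eth0|eth1) echo 1000 ;;
    eth2|eth3) echo 10000 ;;
    *)         echo 0 ;;
  esac
}

check_bond() {               # usage: check_bond bond0 eth0 eth2
  local bond=$1 a=$2 b=$3
  local sa sb
  sa=$(speed_of "$a"); sb=$(speed_of "$b")
  if { [ "$sa" = 1000 ] && [ "$sb" = 10000 ]; } ||
     { [ "$sa" = 10000 ] && [ "$sb" = 1000 ]; }; then
    echo "$bond OK: $a=${sa}Mbps $b=${sb}Mbps"
  else
    echo "$bond MISMATCH: $a=${sa}Mbps $b=${sb}Mbps"
  fi
}

check_bond bond0 eth0 eth2
check_bond bond1 eth1 eth3
```

Running it prints one OK/MISMATCH line per bond, which makes it easy to spot a mis-cabled pair before you commit the uplink changes.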
Now, if that all looks good, you don’t want to have to do this manually on every CVM, so use the built-in allssh command:
allssh manage_ovs --bridge_name br0 --bond_name bond0 --interfaces eth0,eth2 update_uplinks
allssh manage_ovs --bridge_name br1 --bond_name bond1 --interfaces eth1,eth3 update_uplinks
At this point, your new bridge and bond are set up to allow access to 2 different physical networks for dual-homed NTNX love. But now you’re seeing an alert in Prism about your CVMs using an interface slower than 10 Gbps. In this setup, that alert will trigger regardless of whether the active interface is the 10Gbps or the 1Gbps link, but thankfully there is a way to set which interface is active. Let’s make sure that our new bonds are using the 10 Gbps links.
Since we have mismatched interface speeds in our bonds, we need to use the active-backup bonding mode. If this were a traditional cluster, you would have the option of load balancing across 2 active links.
From the AHV host:
ovs-appctl bond/show bond0
---- bond0 ----
bond_mode: active-backup
bond may use recirculation: no, Recirc-ID : -1
updelay: 0 ms
downdelay: 0 ms
active slave mac: xxxxxxxxxxxx(eth2)
slave eth0: enabled
slave eth2: enabled
  active slave
First, verify that the bond is in active-backup mode; if it is not, set it with:
ovs-vsctl set port bond0 bond_mode=active-backup
Once set, we need to see which interface holds the “active slave” role. In my example above, that is eth2: the “active slave mac” line points at eth2, and “active slave” also appears under the eth2 interface entry. I will admit that calling the active interface the “active slave” seems a bit counter-intuitive, but alas.
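If you want to pull that answer out with a script instead of reading the output, here is a hypothetical one-liner wrapped in a sketch. The sample text mirrors the bond/show output format shown above (with a dummy MAC); on a live AHV host you would pipe the real `ovs-appctl bond/show bond0` output into the same sed expression.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: extract the active slave interface name from
# `ovs-appctl bond/show` output. The interface name sits in parentheses
# at the end of the "active slave mac" line.
sample='---- bond0 ----
bond_mode: active-backup
active slave mac: 00:00:00:00:00:00(eth2)
slave eth0: enabled
slave eth2: enabled'

active=$(printf '%s\n' "$sample" | sed -n 's/^active slave mac: .*(\(.*\))$/\1/p')
echo "active slave is $active"
```

On a real host, `ovs-appctl bond/show bond0 | sed -n '...'` with the same pattern gives you the interface name directly, which is handy for conditional fix-up scripts.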
In my case, the 10 Gbps port is the active slave, so I am done. If you aren’t as lucky, or you have a 10 Gbps link fail and have to set this back manually (it will not fail back to the 10 Gbps link automatically as of Acropolis 4.6), all you have to do is:
ovs-appctl bond/set-active-slave bond0 ethX
Repeat that with the appropriate bond# and eth# on each of your hosts and that’s it. Long-winded post, but ultimately just a few commands to get the job done. If you have a large number of hosts, feel free to run these AHV host commands via a bash for loop to automate things a bit.
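As a rough sketch of such a loop: the host IPs and the root SSH user below are placeholder assumptions (substitute your own AHV host addresses), and with DRY_RUN=1 (the default here) it only prints the commands it would run, so you can review before executing for real.

```shell
#!/usr/bin/env bash
# Hypothetical helper: apply the set-active-slave fix on every AHV host.
# Host IPs are placeholders; DRY_RUN=1 prints instead of executing.
HOSTS="10.0.0.11 10.0.0.12 10.0.0.13"
DRY_RUN=${DRY_RUN:-1}
CMDS=()

for h in $HOSTS; do
  # One bond/interface pair per network, matching the example layout above.
  for fix in "bond0 eth2" "bond1 eth3"; do
    CMDS+=("ssh root@$h ovs-appctl bond/set-active-slave $fix")
  done
done

for c in "${CMDS[@]}"; do
  if [ "$DRY_RUN" = 1 ]; then echo "$c"; else $c; fi
done
```

Run it once in dry-run mode, eyeball the printed ssh commands, then rerun with DRY_RUN=0 to actually push the change.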