Set up a dual-homed Nutanix cluster on Acropolis 4.6

Note: This is not a Nutanix best practice and should not be done without first discussing the caveats with upper management.

This is a unique situation I came across in the most recent build of a Nutanix cluster. The project required the Nutanix cluster to support 2 distinct and separate networks. Since each NTNX node has 2x 10Gbps and 2x 1Gbps interfaces, I needed to create 2 bonds… each bond containing one 10Gbps and one 1Gbps interface.

Herein lies the issue and why this is an unsupported setup. The architecture of NTNX’s clusters leverages a Controller VM (CVM) that resides on each node in the cluster. These CVMs talk to one another and are essentially the brains behind the operation and what makes NTNX so great. This is why the best practice is to pair your 10Gbps links in bond0 and pair your 1Gbps links in bond1. If you do a dual-homed setup and happen to lose one of your 10Gbps links, most scenarios will see you saturating the remaining 1Gbps link in that bond, and performance will be degraded. As long as all parties understand the risks associated with a dual-homed setup of this nature, let’s get cracking.

Time to create the new bridge… SSH into one of your AHV hosts and run:
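Something like the following should do it. I’m assuming the new bridge is named br1 here; use whatever name fits your environment, and remember it ultimately needs to exist on every AHV host in the cluster.

# On the AHV host: create a second Open vSwitch bridge for the new network
ovs-vsctl add-br br1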

SSH into the CVM of your choice, or if you’re already SSH’d into your AHV node, just run ‘ssh nutanix@192.168.5.254’ for internal access to the local CVM.

Take a look at your interfaces first:
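From the CVM, something along these lines will list the physical interfaces with their link speeds and show how the default bridge and bond are currently laid out (double-check the exact flags with manage_ovs --help on your release):

# List physical interfaces and their link speeds
manage_ovs show_interfaces

# Show which interfaces currently belong to which bridge/bond
manage_ovs show_uplinks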

Acropolis puts all of your interfaces into a single bridge and bond by default, so we will create a new bridge and bond and split the interfaces into eth0/eth2 and eth1/eth3. Obviously you will have to physically cable each pair to the appropriate network, so these interface names may look different on your end.

Modify bridge0 and bond0 to only contain the interfaces of network 1:
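A sketch of that step, assuming the defaults on this release are br0/bond0 and that eth0 (10Gbps) and eth2 (1Gbps) are the interfaces cabled to network 1:

# From the CVM: shrink the default bond down to the network 1 interfaces
manage_ovs --bridge_name br0 --bond_name bond0 --interfaces eth0,eth2 update_uplinks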

Create bridge1 and bond1 to contain only the interfaces of network 2:
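And the matching sketch for the second network, assuming eth1 (10Gbps) and eth3 (1Gbps) are cabled to network 2 and that br1 was created on the host earlier:

# From the CVM: attach the network 2 interfaces to the new bridge as bond1
manage_ovs --bridge_name br1 --bond_name bond1 --interfaces eth1,eth3 update_uplinks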

Verify everything looks good:
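For example:

# From the CVM: confirm each bond now holds one 10Gbps and one 1Gbps interface
manage_ovs show_uplinks

# From the AHV host: double-check the OVS view of bridges, bonds, and ports
ovs-vsctl show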

Now, if that all looks good, you don’t want to have to do this manually on every CVM, so use the built-in allssh command:
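A sketch of wrapping the same manage_ovs calls in allssh so they run against every CVM in the cluster from a single session (the br1 bridge still has to be created on each AHV host first):

# Run the uplink updates on every CVM in the cluster
allssh "manage_ovs --bridge_name br0 --bond_name bond0 --interfaces eth0,eth2 update_uplinks"
allssh "manage_ovs --bridge_name br1 --bond_name bond1 --interfaces eth1,eth3 update_uplinks"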

At this point, your new bridge and bond are set up to allow access to 2 different physical networks for dual-homed NTNX love. But now you’re seeing an alert in Prism about your CVMs using an interface that is slower than 10Gbps. This alert will trigger all the time in this setup, regardless of whether the active interface is the 10Gbps or the 1Gbps link, but thankfully there is a way to set which interface is active. Let’s make sure that our new bonds are using the 10Gbps links.

Since we have mismatched interface speeds in our bonds, we need to use the active-backup bonding mode. If this were a traditional cluster, you would have the option of load balancing across 2 active links.

From the AHV host:
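For example, to inspect bond0 (repeat for bond1):

# Show bond details, including the bond mode and which member is currently active
ovs-appctl bond/show bond0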

First verify that you are in active-backup mode; if you are not, you can set it with:
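Something along these lines, assuming bond0 is the bond in question (repeat for bond1):

# Put the bond into active-backup mode so only one member carries traffic at a time
ovs-vsctl set port bond0 bond_mode=active-backup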

Once set, we need to see which interface holds the “active slave” role. In my example above, that would be eth2: the active slave MAC matches eth2, and under the eth2 interface information you can see “active slave.” I will admit that calling the active interface the “active slave” seems a bit counter-intuitive, but alas.

In my case, the 10Gbps port is the active slave, so I am done. If you aren’t as lucky, or perhaps you have a 10Gbps link fail and have to manually set this back (it will not auto-repair back to the 10Gbps link as of Acropolis 4.6), all you have to do is:
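Assuming, as in my example, that eth2 is the 10Gbps member of bond0:

# On the AHV host: force the 10Gbps interface to be the active member of the bond
ovs-appctl bond/set-active-slave bond0 eth2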

Repeat that for the appropriate bond# and eth# on each of your hosts and that’s it. Long-winded post, but ultimately it’s just a few commands to get the job done. If you have a large number of hosts, feel free to run these AHV host commands via a for loop in bash to automate it a bit, as sketched below.
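A rough sketch of such a loop, with placeholder AHV host IPs you would swap for your own (it assumes key-based root SSH to the hosts):

# Hypothetical AHV host IPs; substitute your own
for host in 10.1.1.11 10.1.1.12 10.1.1.13; do
  ssh root@"$host" "ovs-appctl bond/set-active-slave bond0 eth2"
done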
