Protecting Horizon View Environments with Nutanix Replication

VDI is a mission-critical service in most large enterprises. In the event of a disaster, redeploying non-persistent instances can be as easy as enabling a new resource group. Unfortunately, that does not address user customizations stored in “writable” volumes. These are not always required, as the VMware best practices guide notes: “You could also consider only backing up the writable volumes very infrequently or not initially giving users their writable volumes in a DR event but instead focusing on the core components and then adding nice-to-have things like writable volumes later.” (Providing Disaster Recovery for VMware Horizon | VMware). Still, there are many use cases where losing those customizations reduces productivity. Consider the cost to an organization with thousands of users who each spend 30 minutes customizing settings, downloading new Outlook caches, and configuring their workspace after an outage. Given this cost, a solution that allows users to reconnect to their writable volumes at a DR site may be needed.
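To put a rough number on that cost, the arithmetic can be sketched as follows. The 5,000-user figure is an assumption chosen purely for illustration; only the 30 minutes per user comes from the scenario above.

```python
def lost_hours(users: int, minutes_per_user: float) -> float:
    """Total staff hours spent re-creating customizations after an outage."""
    return users * minutes_per_user / 60


# Illustrative inputs: 5,000 users is an assumed figure; 30 minutes is from the text.
hours = lost_hours(users=5000, minutes_per_user=30)
print(f"{hours:.0f} staff hours lost")
```

Multiplying the result by an average hourly labor cost for your organization gives a ballpark figure to weigh against the effort of replicating the writable volumes.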

Nutanix AOS includes robust replication and recovery technology, but in the GUI this technology focuses on VMs and their attached disks. That works well for traditional VMs. In Horizon View, however, the writable disks are not directly attached to the VM; they are dynamically assigned as users log in, so they are not replicated along with the user’s VDI session. This creates a need for what Nutanix calls “VStore Protection Domains” (VStore PDs). A VStore PD maps a vStore, which is a mount point within a storage container that has its own NFS namespace, to a protection domain. The protection domain captures every object (file) placed within the container and replicates it to another cluster. The “writable” files are therefore replicated to a second cluster, which enables recovery of that data if a failover between sites is required.

Because VStore PDs are a special case, they can only be created from the CLI. Here’s how:

Create Protection Domain

  1. In Prism Element, create a new, empty container on the source cluster and mount it to all the ESXi hosts.
  2. On the destination cluster, create a second container with the same name, but leave it UNMOUNTED from all hosts.
  3. If not already done, set up the replication connections under Data Protection by using “+ Remote Site”. Make sure the two “Writables” containers are correctly mapped. This must be done on both clusters.
  4. Create a VStore PD from the command line of the source Prism Element VIP:
    1. Find the name of the container to protect -> ncli vstore list
    2. Protect the vStore -> ncli vstore protect name=Writables
  5. Create a schedule for the replication from the Prism Element GUI:
    1. You will not be able to select entities, because everything in the container is protected as an NFS share.
    2. Add a new schedule and set the retention policy for both the source and the destination.
  6. Verify that replication is successful by confirming that snapshots exist on both the source and destination clusters.
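The two CLI steps above can be captured in a small script. This is only a sketch: the helper names `protect_commands` and `run` are hypothetical, the container name `Writables` is taken from the example above, and the script defaults to a dry run that prints the commands rather than executing them.

```python
import subprocess
from typing import List


def protect_commands(container: str) -> List[List[str]]:
    """Return the two ncli invocations used above to protect a vStore."""
    return [
        ["ncli", "vstore", "list"],                          # find the container name
        ["ncli", "vstore", "protect", f"name={container}"],  # create the VStore PD
    ]


def run(commands: List[List[str]], execute: bool = False) -> None:
    """Print each command; run it only when execute=True (where ncli is available)."""
    for cmd in commands:
        print("$", " ".join(cmd))
        if execute:
            subprocess.run(cmd, check=True)


run(protect_commands("Writables"))  # dry run: prints the commands only
```

Passing `execute=True` only makes sense on a system where `ncli` is installed, such as the CLI session reached through the source Prism Element VIP.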

Failover

Now that the files in our writables container are replicating from the source to the destination, let’s practice a failover.

  1. Take note of the files in the Writables datastore from within vCenter.
  2. Unmount the Writables container from the source cluster.
  3. Activate the destination protection domain from the CLI of the destination cluster:
    1. List the protection domains -> ncli pd list
    2. Activate the protection domain -> ncli pd activate name=Writables_nnnnnnnnnnnnn
  4. Mount the Writables container on all hosts in the destination cluster.
  5. Verify that the Writables container is mounted and that the replicated files are accessible.
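Because the protection domain name carries a numeric suffix, it is easy to mistype during a failover. The sketch below picks the matching name from a list of names you have copied from `ncli pd list` and builds the activate command. It assumes the name follows the `<container>_<digits>` pattern seen in this walkthrough; the function names and the `GoldVMs` entry are hypothetical, and the script does not parse `ncli` output itself.

```python
import re
from typing import List


def find_vstore_pd(pd_names: List[str], container: str) -> str:
    """Pick the single PD named '<container>_<digits>' from a list of PD names."""
    pattern = re.compile(rf"^{re.escape(container)}_\d+$")
    matches = [name for name in pd_names if pattern.match(name)]
    if len(matches) != 1:
        raise ValueError(f"expected exactly one match, found {matches}")
    return matches[0]


def activate_command(pd_name: str) -> str:
    """Build the activate command shown in the failover steps."""
    return f"ncli pd activate name={pd_name}"


# Names as copied by hand from `ncli pd list`; "GoldVMs" is a made-up example.
names = ["GoldVMs", "Writables_1701976519425"]
print(activate_command(find_vstore_pd(names, "Writables")))
```

Raising an error when zero or several names match is a deliberate choice: during a failover it is safer to stop and check than to activate the wrong protection domain.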

Now that the Writables container is visible to the remote VMware environment, with the same datastore name and files, you can recover your Horizon View environment.

Cleanup

To clean up your destination cluster, run the following from the CLI of the destination cluster:

  1. Get the name of the protection domain to be cleaned up -> ncli pd list
  2. Deactivate the PD and clean up files at the destination site -> ncli pd deactivate_and_destroy_vms name=Writables_1701976519425
    1. NOTE – THIS IS A HIDDEN COMMAND THAT WILL DESTROY ANY VMS CURRENTLY RUNNING IN THE DESTINATION SITE ASSOCIATED WITH THE PROTECTION DOMAIN. USE AT YOUR OWN RISK.
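Given how destructive that command is, one option is to wrap it in a guard that refuses to produce the command unless the operator retypes the exact protection domain name. The following is a minimal sketch of that idea; `destroy_command` is a hypothetical helper, and it only builds the command string without running it.

```python
def destroy_command(pd_name: str, typed_confirmation: str) -> str:
    """Build the deactivate command only if the operator retyped the PD name."""
    if typed_confirmation != pd_name:
        raise PermissionError("confirmation does not match the protection domain name")
    return f"ncli pd deactivate_and_destroy_vms name={pd_name}"


# The second argument would normally come from input(); it is hard-coded here.
print(destroy_command("Writables_1701976519425", "Writables_1701976519425"))
```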

Now that the protection domain on the destination cluster has been deactivated, replication from the source cluster will resume as expected. Note that the deactivation command does not delete any snapshot data, so subsequent replications remain delta-only.

As you can see, protecting and recovering Horizon View writable volumes involves some movement between the GUI and the CLI, but the process is relatively simple. In just a few minutes, you can ensure that after a failover your users are productive sooner instead of wasting time re-creating their customizations.

Resources:

PD-Based DR 6.7 – VStore Data Protection (nutanix.com)

AOS 6.7 – Nutanix Command-Line Interface Reference

PD-Based DR 6.7 – Configuring Data Protection with NearSync Replication (20-59 seconds RPO) for VStore Protection Domains (nutanix.com)

Nutanix VDI Example Architecture for 20K to 200K+ Power User Desktops | Long White Virtual Clouds (longwhiteclouds.com)