In this article we'll cover the steps involved in performing a partial-site failover of your production environment to the iland Secure DRaaS environment. This can be used when an individual VM, or a group of VMs that make up an application, has gone offline but the rest of the environment is still up and running.
The first important thing to understand about partial-site failovers is how to start the failover process. Unlike a full-site failover, where you use the Recovery groups within the iland Secure Cloud Console, a partial-site failover is started from the Veeam Console instead. Starting it there allows the Veeam software to trigger a different failover process, one that also handles networking differently.
Failing over individual VMs while other VMs keep running at the production site requires some Layer 2 (L2) magic to ensure transparent IP connectivity between the two sites without having to re-IP the failed VM or make any complex switching or routing changes to your network. The good news is that Veeam automatically provides this L2 magic by way of the Network Extension Appliance (NEA).
The NEA is a purpose-built, lightweight appliance built into the Veeam platform that provides L2 extension between the primary and target site networks. It provides the mechanism by which traffic from production-side VMs destined for the recovered VMs is intercepted (via proxy ARP) and routed across a secure tunnel to the NEA at the DR site. The end result is seamless, transparent L2 connectivity between production and DR without the need to re-IP any VMs or change any switching/routing infrastructure. The NEA also isolates this extended traffic so that it does not interact with the rest of the production networks and workloads.
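The proxy ARP behaviour described above can be illustrated with a small toy model. This is only a sketch of the general proxy ARP idea, not Veeam's implementation: the class name `ProxyArpResponder`, its fields, and the example addresses are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class ProxyArpResponder:
    """Toy model of a production-side appliance that answers ARP
    requests on behalf of VMs that now run at the DR site.
    All names here are hypothetical, not part of any Veeam API."""

    own_mac: str
    failed_over_ips: Set[str] = field(default_factory=set)

    def handle_arp_request(self, target_ip: str) -> Optional[str]:
        # Reply with the appliance's own MAC only for failed-over IPs,
        # so production hosts send those frames to the appliance.
        if target_ip in self.failed_over_ips:
            return self.own_mac
        # Stay silent for every other address; the real owner answers.
        return None

    def next_hop(self, dst_ip: str) -> str:
        # Traffic for failed-over IPs goes over the tunnel to the DR
        # side; everything else stays on the local production network.
        return "tunnel" if dst_ip in self.failed_over_ips else "local"


# Example with made-up addresses: one VM (10.0.0.20) has failed over.
nea = ProxyArpResponder(own_mac="00:50:56:aa:bb:cc",
                        failed_over_ips={"10.0.0.20"})
failed_over_reply = nea.handle_arp_request("10.0.0.20")
other_reply = nea.handle_arp_request("10.0.0.30")
```

Because the production hosts still resolve the original IP address (now to the appliance's MAC), nothing on the production side needs to be re-addressed, which is the point the paragraph above makes.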
During the partial-site failover process, Veeam automatically powers the NEAs on and off as needed, so no user intervention is required for this part. The failed-over replicas also use your source-side default gateway for public network access. However, it is important to note that Veeam does not automatically shut down your production-side servers, even after they are failed over. So if you are testing this process, you may need to shut down the source servers manually to avoid any IP or hostname conflicts. This important step is outlined further in the steps below.
There are two common ways you may want to test a partial-site failover:
Basic Testing - with this method you just want to confirm that the VM(s) can come online at the DR site and check OS / application performance and stability; it does not require any network connectivity back to the production site. To perform this, you can log in to the iland Secure Cloud Console and gain direct console-level access to your VMs.
Extended Testing - with this method you verify production readiness of the individual failed-over VMs by taking your production VM instance(s) offline and having full connectivity between production and the DR site for any end-user and/or application access. For this to work, you must either shut down the production instance(s) or disable the main vNIC of the production VMs that are failed over. Doing so simulates a true outage of those VM instances, which allows the Veeam NEA to perform the proxy ARP and route the traffic to the DR site.
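The precondition for extended testing (each failed-over production VM is either shut down or has its main vNIC disabled) can be written as a simple check. The sketch below is a hypothetical helper for reasoning about that rule; the `ProductionVm` record and `conflicting_vms` function are assumptions made for illustration and do not query Veeam or any hypervisor.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ProductionVm:
    """Hypothetical record of a production VM that has been failed over."""
    name: str
    powered_on: bool
    nic_connected: bool


def conflicting_vms(vms: List[ProductionVm]) -> List[str]:
    """Return names of production VMs that would still answer on the
    network, and so could conflict with their failed-over replicas."""
    # A VM is only a conflict risk if it is running AND its main vNIC
    # is connected; either shutting it down or disabling the vNIC
    # is enough to simulate an outage.
    return [vm.name for vm in vms if vm.powered_on and vm.nic_connected]


# Example inventory with made-up VM names.
inventory = [
    ProductionVm("app01", powered_on=False, nic_connected=True),   # shut down
    ProductionVm("db01", powered_on=True, nic_connected=False),    # vNIC disabled
    ProductionVm("web01", powered_on=True, nic_connected=True),    # still live
]
still_live = conflicting_vms(inventory)
```

In this example only `web01` would need attention before extended testing, since it is the only VM still powered on with its vNIC connected.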
Once you've completed testing, or you've recovered from the actual outage event, you can perform a failback of the individual VMs that were failed over to the iland DR site.