FTF Veeam v6 Replication – Veeam Architecture

When Veeam introduced the new distributed architecture with v6.0 there were significant improvements to the replication engine in both speed and efficiency. There were also significant changes to the Veeam architecture that enabled those improvements. In my experience, many customers have misunderstood the deployment requirements and the various configuration options when first attempting new v6 jobs. The trouble is due partly to in-place upgrades of Veeam v5.x jobs, but also to the fact that the out-of-the-box automatic proxy selection settings can produce less than optimal results when configured for multi-site replication.

This post explains the new proxy architecture and requirements at a high level and offers some tips on making sure things work right. There are a few screen shots from the Replication and Proxy wizards, but if you need more technical details on Veeam Proxies and the actual features or job configs, please be sure to check the Veeam User Guide and the Veeam Forum F.A.Q. (linked below).

This post is part of the VMETC.com From the Field Series on Veeam v6 Replication

Proxy Primer

The easiest way to understand the Veeam v6 Proxy architecture is to follow the data flow of a Veeam replication job. It is explained very simply in the Veeam Forums Backup and Replication F.A.Q.:

READ THIS FIRST : [FAQ] FREQUENTLY ASKED QUESTIONS

Architecture

Q: What is the data flow in case of replication?

A: Disk > Source proxy > Network > Target proxy > Disk

Q: Can I use the same source and target proxy for replication?

A: Yes, but only when replicating locally (on-site replication).

“Danger Will Robinson!” Notice the last Q&A? You will need at least one Veeam Proxy at both the source and target sites if they are different locations. You may even need more than one Veeam Proxy per site. More on why later in this post.

The basic Veeam replication architecture is illustrated in the following whiteboard diagram:

Architecture Overview

  • Install Veeam at the target location for replication jobs. Installing Veeam also creates the target proxy.
  • Use the Veeam console to create a Proxy at the source location.
  • A Veeam Proxy is any supported Windows OS (virtual or physical). Linux servers are not an option.

Note that I am discussing replication only in this post. Many customers do install a separate instance of Veeam at the source site for nightly backups. In that scenario both Veeam installs are independent and do not share jobs or configs. Veeam Enterprise Manager (not illustrated in the whiteboard) then consolidates both installs in a single pane of glass for monitoring and managing all jobs at both sites.

Proxies – where and why

Why put a separate proxy at each location? As already established, replication data flow moves from source disk to target disk through the proxies (no longer through ESX management NICs!), but there is more than just the data path to understand.

The source proxy reads the VM blocks from the local datastores, compresses and dedupes them, and then sends the data across the WAN in compressed form. The target proxy then re-hydrates the VM as it writes it to the local datastores at the target. As you can see, each proxy not only moves data but performs several other tasks – read, write, manage, compress, inflate, etc. Correctly selecting a proxy local to each site is critical to optimized and efficient replication.
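To make the compress-at-source, re-hydrate-at-target idea concrete, here is a minimal Python sketch of that data path. It is a toy model only – zlib stands in for Veeam's actual compression, and the proxy function names are invented for illustration:

```python
import zlib

def source_proxy_send(blocks):
    """Source proxy: compress changed VM blocks before the WAN hop."""
    return [zlib.compress(b) for b in blocks]

def target_proxy_receive(payload):
    """Target proxy: re-hydrate blocks and write them to the target datastore."""
    return [zlib.decompress(p) for p in payload]

# Simulated VM disk blocks (highly compressible, like zeroed disk regions)
blocks = [b"A" * 4096, b"B" * 4096]
wire = source_proxy_send(blocks)        # what actually crosses the WAN
restored = target_proxy_receive(wire)
print(sum(len(b) for b in blocks), "bytes on disk,",
      sum(len(p) for p in wire), "bytes on the WAN")
```

The point is simply that only the compressed form ever touches the network when a proxy sits on each side.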

Here are two commonly made mistakes.

  1. Veeam installed at the source site only – This means a single proxy (the default Veeam Server install). VM blocks are read and compressed but then inflated again before they ever leave the source site. The replica VM is written to the target datastore across the WAN uncompressed.
  2. Veeam installed at the target site only – Again, the default install means a single proxy. VM blocks are read remotely by the target proxy, so the data traverses the WAN before being compressed. After reaching the target proxy the data is compressed (needlessly) and inflated again, and then written to the target datastore.

The above two scenarios are oversimplified. In both there could also be a lot of traffic “ping pong” bounced back and forth between sites as the Veeam Console (GUI) attempts to manage and send commands between the proxies.
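A back-of-the-envelope sketch of the WAN cost of one changed block in each deployment. Again a toy model, assuming zlib-style compression and made-up block contents:

```python
import zlib

block = b"x" * 65536                  # one changed VM block (toy data)
compressed = zlib.compress(block)

# Bytes that cross the WAN for this block in each deployment (toy model):
both_sites = len(compressed)    # proxy at each site: compressed over the WAN
source_only = len(block)        # replica written across the WAN uncompressed
target_only = len(block)        # block read across the WAN before compression

print(both_sites, source_only, target_only)
```

Only the two-proxy deployment keeps the full-size block off the WAN.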

Pick your Proxies Manually

After deploying a Veeam Proxy at both locations, I recommend manually configuring specific proxies for each job. Do this on the Job Settings screen in the Data transfer section.

Configure Proxies for Job

In the screen shot above you can see the fields for the Source proxy and the Target proxy. Clicking the “Choose” button allows you to tick checkboxes and select from all your configured proxies. Choosing multiple proxies for each job is possible for redundancy.

Be careful with Automatic Proxy Selection

The default data transfer setting is “automatic”. This means that Veeam will use its own A.I. to pick proxies for you, by watching the availability and load on every configured proxy. This can be a bad thing, however. As proxies take on jobs and reach their max concurrent tasks (more later in this post), Veeam might pick a remote proxy to handle local tasks just because it is the only proxy available for the “automatic” selection. Personally, I would rather a job queue and wait for the right proxies to become available than introduce uncompressed replication or management “ping pong” traffic on the WAN.
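The difference between the two modes can be sketched like this. This is a hypothetical model of the selection logic, not Veeam's actual algorithm; the proxy names and fields are invented:

```python
# Toy model of proxy selection; field names are illustrative, not Veeam's API.
proxies = [
    {"name": "proxy-src", "site": "source", "active": 1, "max": 1},  # saturated
    {"name": "proxy-tgt", "site": "target", "active": 0, "max": 1},  # idle
]

def automatic_pick(proxies):
    """Automatic mode: any proxy with a free task slot, least loaded first."""
    free = [p for p in proxies if p["active"] < p["max"]]
    return min(free, key=lambda p: p["active"])["name"] if free else None

def manual_pick(proxies, allowed):
    """Manual mode: only the pinned proxies; None means the job queues."""
    free = [p for p in proxies if p["name"] in allowed and p["active"] < p["max"]]
    return free[0]["name"] if free else None

# The local source proxy is busy, so automatic mode grabs the remote one...
print(automatic_pick(proxies))              # proxy-tgt (wrong side of the WAN)
# ...while a manually pinned job simply waits for the right proxy.
print(manual_pick(proxies, {"proxy-src"}))  # None -> job queues
```

The sketch shows exactly the failure mode described above: once the local proxy is saturated, “automatic” happily reaches across the WAN.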

Repository for Replication

In the replication Job Settings screen shot above, the Repository field is below the Source and Target proxy fields. Many customers confuse this setting with Veeam's Backup Repositories. The shared terminology leads one to believe their target disk for backup jobs must be configured as part of a replication job. Add to that, a repository for replication is created exactly the same way as a repository for backup. Despite the similarity, a replication repository most often should be a new and separate server at the source site. A recent Forum thread clearly explains why:

It's basically just a place to store the hashes for deduplication during replication. Since dedupe is performed by the source proxy it's best to keep it on the source side, otherwise the metadata has to be read/written across the WAN by the source proxy.

If the metadata is missing, Veeam will simply recreate it by rescanning the target side and recalculating the hashes, but it's only needed during a replication, not for failover.

A replication job repository is significantly smaller than its backup cousin (a few GBs in size, so no extra space is needed). When in doubt, use the source site Proxy (add it as a Repository in Veeam too). Recalculating the hashes can take some time, so try to pick a server that will be consistently available at the source site.
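Here is a toy illustration of why that hash metadata matters. The structures are hypothetical (a plain set of SHA-256 digests), not Veeam's real repository format, but the principle is the same: the source proxy only sends blocks whose hashes are not already known, and the lookup should happen locally rather than across the WAN:

```python
import hashlib

def replicate_blocks(blocks, hash_repo):
    """Send only blocks whose hash is not already in the (source-side)
    repository. hash_repo is a toy stand-in for the repository metadata."""
    sent = []
    for block in blocks:
        digest = hashlib.sha256(block).hexdigest()
        if digest not in hash_repo:
            hash_repo.add(digest)
            sent.append(block)   # new data: crosses the WAN
        # duplicate data: only a cheap local hash lookup happens
    return sent

repo = set()
first_pass = replicate_blocks([b"os-image"] * 3 + [b"app-data"], repo)
second_pass = replicate_blocks([b"os-image", b"app-data"], repo)
print(len(first_pass), len(second_pass))   # 2 0
```

Losing the repo is not fatal, as the quote says – the set can be rebuilt by rescanning and re-hashing the target side, it just takes time.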

DNS and Permissions

A Veeam replication job to a DR site spans multiple datacenters. That often means different security, name resolution, and permissions – both in the virtual infrastructure and in the directory services of the guest VMs. Looking at the whiteboard from earlier in this post will help you visualize this design challenge.

I have found that the following two Veeam KB Articles are frequently needed by customers setting up new replication jobs:

KB1518: Ports needed for use with Veeam backup and replication 6

KB1198: NFC server connection is unavailable: troubleshooting steps

Where to install the Veeam Server?

I mentioned earlier that for replication jobs a Veeam server should be installed at the target site. Although with v6 Veeam Server placement is not an architecture requirement, ensuring that the Veeam server is still around at the time of disaster gives a customer more failover/recovery options and functionality.

Some awesome reasons to put Veeam at the DR site are:

  • Re-IP VMs during failover
  • Failover testing
  • Permanent failover
  • VM Failback

Note that the placement of the Veeam Console (install) is no longer critical to the actual performance of the replication job. Since Proxies now handle the actual flow of data between sites, the Veeam server simply sends commands to start the job and then collects the statistics and results. Should the Veeam server be lost during a DR event, all replica VMs can still be started manually via the hypervisor's native management client.

Concurrent Jobs (or Tasks)

The Add Proxy Wizard initially allows you to configure the max concurrent tasks. You can right-click a previously configured Proxy and change the Properties later.

New Proxy Wizard - Proxy Settings

The Proxy rule of thumb is this – for every 2 CPUs on a Proxy, Veeam can send it 1 job (task).
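That rule of thumb is simple arithmetic, sketched below. The floor of one task for a small proxy is my own assumption, not something stated in the rule:

```python
def max_concurrent_tasks(cpu_count):
    """Rule of thumb from the post: one concurrent task per 2 CPUs on a proxy.
    Flooring at 1 task for a 1-CPU proxy is an assumption for illustration."""
    return max(1, cpu_count // 2)

for cpus in (2, 4, 8):
    print(cpus, "CPUs ->", max_concurrent_tasks(cpus), "task(s)")
    # 2 CPUs -> 1 task, 4 CPUs -> 2 tasks, 8 CPUs -> 4 tasks
```

So an 8-vCPU proxy can keep four replication tasks in flight at once, which is exactly why CPU count matters so much below.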

In the screen shot above, this proxy will only handle one job at a time. I borrowed the image from Rick Vanover's post How to install a Veeam Proxy (vSphere). He does a great job explaining the process there. Vanover also has a walkthrough post on Veeam's blog about how to add Veeam Proxies.

The bottom line is that for all Veeam proxies, both at the source and target sites, give your proxies as many CPUs as you can afford. That allows you to make 80/20 VM adjustments, run more jobs simultaneously, and get the most done in your replication window.

More about Proxy concurrent tasks can be found in the online Veeam Help – Veeam Backup & Replication – Limiting the Number of Concurrent Tasks.

Summary

Veeam replication has many options to meet any VMware or Hyper-V infrastructure demands. Granular feature configs and a flexible architecture of Veeam Proxies create a disaster recovery solution that can scale to any size as needed. With a deeper understanding of all the options available, an admin can make smart changes to the default Veeam install that allow for fast and efficient replicas in the desired job window.

  • cby

    Excellent article. I encountered all the points you mentioned and addressed them in the same fashion, proxies in particular.

    One area you don’t mention that caused me grief was the comms between the DR site proxy and the main Veeam backup server, and between the proxy and the ESXi host at the DR site.

    For reasons outside of this discussion we have multiple IP addresses assigned to the backup server NIC connecting over a WAN link to the proxy. To complicate matters further we have several levels of NAT rules in place. By trawling through the Veeam logs after failed attempts to replicate I discovered that the Veeam backup server publishes all its IP addresses to the proxy when initiating the replication process. The proxy in turn attempts to connect to each IP until it establishes a connection to the Veeam server. None of the published Veeam server’s local addresses were visible to the proxy because of NAT’ing. By assigning the NAT’d IP address to the Veeam server, as seen by the proxy, we were able to overcome this issue. The NAT’d IP now appears in the previously mentioned list of published IPs.
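    [Ed.] The try-each-published-IP behavior cby describes could be sketched like this. All addresses and the reachability check are made up for illustration; the real proxy does a TCP connect rather than a set lookup:

```python
def connect_to_veeam_server(published_ips, reachable):
    """Try each IP the Veeam server publishes until one answers.
    `reachable` is a stand-in predicate for a real TCP connection attempt."""
    for ip in published_ips:
        if reachable(ip):
            return ip
    return None

# Only the NAT'd address is visible from the proxy's side of the WAN.
published = ["192.168.1.10", "10.0.0.10", "203.0.113.5"]   # example addresses
visible = {"203.0.113.5"}
print(connect_to_veeam_server(published, visible.__contains__))  # 203.0.113.5
```

    Until the NAT’d address was added to the published list, every candidate failed the reachability check and the connection loop came up empty.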

    The workaround for the ESXi comms was to define the ESXi hostname with its (locally) associated IP address on both the Veeam server and the proxy. So now when the proxy resolves the host name it sees the local IP as opposed to the IP associated with the host on the Veeam server. It’s not ideal but does the job well.
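    [Ed.] That hosts-file workaround boils down to preferring a local override before DNS. A minimal sketch with invented host names and addresses:

```python
def resolve(hostname, local_overrides, dns):
    """Prefer a local hosts-style override before falling back to DNS,
    mirroring the workaround of pinning the ESXi host name to its local IP."""
    return local_overrides.get(hostname, dns.get(hostname))

dns = {"esxi-dr.example.com": "203.0.113.20"}        # address as seen at HQ
overrides = {"esxi-dr.example.com": "10.10.0.20"}    # local address at DR site
print(resolve("esxi-dr.example.com", overrides, dns))  # 10.10.0.20
print(resolve("esxi-dr.example.com", {}, dns))         # 203.0.113.20
```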

    The multiple NAT’ing rules are what really threw it, especially with the replication running over a secure .gov network.

    Regards
    cby

    • http://vmetc.com rbrambley

      cby,

      Thanks!

      I covered the scenario you ran into, admittedly very briefly, in the DNS and Permissions section. Or at least that was my intent. Maybe I should have titled that section DNS, Permissions, and Networking! Oh My! :) . The reality is that you have some complex NAT-ing that would have impacted any application architecture stretched across multiple sites and networks. Thanks for sharing the issue and your resolution here. It is a great example of some of the environmental factors that must be accounted for in a Veeam replication solution.
