Simply Automating Virtual Machine IP Addressing For Disaster Recovery Sites (without scripting)

If you are looking at options for automating virtual machine (VM) IP address reconfiguration when failing over VMs to a disaster recovery (DR) site, this post explains one so simple it is beautiful. To give full credit, the Vizioncore vReplicator 2.5 Best Practices document introduced me to the strategy of using a local-only VMware vSwitch and an extra virtual NIC (vNIC) in each VM. It’s been a long time since I had a “ton of bricks” moment, but this concept crashed down on me with the realization that the configuration works in any version of ESX, doesn’t require extra software or hardware, and, better yet, doesn’t have to be scripted! Just configure some extra virtual networking and forget about it!

Here is a general outline for automating DR IP addressing with this method:

At the Primary Site

  • For these instructions, assume the production vSwitch at the primary site has a Portgroup named VM Network.
  • Build a new vSwitch and do not attach any physical NICs (a local-only, isolated switch). Create a Portgroup on it named DR Network.
  • For each VM you need to fail over to a DR site, add an extra vNIC and attach it to the DR Network Portgroup.

At the DR Site

  • Create your DR site production vSwitch, attach physical NICs, and add a Portgroup named DR Network.
  • Create another vSwitch and do not attach any physical NICs (a local-only, isolated switch). Create a Portgroup on it named VM Network.

All you have to do for this to work is configure the extra vNIC in each VM at the primary site with the correct VLAN and IP address of the DR site production network. This IP address configuration will be cloned in the replicated DR VM, and when it’s time to power on the VM at the DR site, both vNICs will connect to their respective virtual networks by Portgroup name. At the DR site, the DR Network Portgroup is on the vSwitch with physical connectivity to the DR production network, and the VM Network vNIC is isolated on a local-only vSwitch.
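
To make the idea concrete, here is a minimal Python sketch that models the two sites. It does not talk to ESX or vCenter at all; it only illustrates why the same two-vNIC VM ends up reachable on a different address at each site. The class names, the `reachable_ips` helper, and the addresses (taken from the documentation ranges) are all my own assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PortGroup:
    name: str
    has_uplink: bool  # True if the backing vSwitch has physical NICs attached


@dataclass(frozen=True)
class VNic:
    portgroup: str  # Portgroup name the vNIC is attached to
    ip: str         # static IP configured inside the guest


# Hypothetical VM with one vNIC per Portgroup name (addresses are placeholders).
VM_NICS = [
    VNic(portgroup="VM Network", ip="192.0.2.25"),     # primary site production IP
    VNic(portgroup="DR Network", ip="198.51.100.25"),  # DR site production IP
]

# Same Portgroup names at both sites; only the uplinks are swapped.
PRIMARY_SITE = {
    "VM Network": PortGroup("VM Network", has_uplink=True),
    "DR Network": PortGroup("DR Network", has_uplink=False),
}
DR_SITE = {
    "VM Network": PortGroup("VM Network", has_uplink=False),
    "DR Network": PortGroup("DR Network", has_uplink=True),
}


def reachable_ips(site, nics):
    """Return the guest IPs whose vNIC lands on a Portgroup with physical uplinks."""
    return [nic.ip for nic in nics if site[nic.portgroup].has_uplink]


if __name__ == "__main__":
    print("Primary site:", reachable_ips(PRIMARY_SITE, VM_NICS))
    print("DR site:     ", reachable_ips(DR_SITE, VM_NICS))
```

The VM definition never changes between sites. Only the Portgroup-to-uplink mapping differs, which is the whole trick.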

Vizioncore deserves the credit for showing me the idea, but there’s no reason why this won’t work with any VM replication or VM backup product. It’s not an automated DR failover workflow like VMware SRM, but using this extra virtual networking strategy eliminates some manual or even scripted IP re-addressing.

Simple!

  • http://professionalvmware.com professionalvmware

    Great post. The simple solution is often the most elegant.

  • michael

    What about DRS (at the fully automated level)? I think it will be affected, because trying to VMotion a VM connected to an intranet portgroup throws a warning.

  • http://twitter.com/dpironet Didier Pironet

    For Windows 2003 guests, add a secondary IP address and default GW for the DR network.

    For Windows 2008 and up, you can use the Alternate Configuration tab (TCP/IP Properties) to set a completely different network configuration.

    All VMs in DR scope have 2 IPs for the 2 networks, Prod and DR. No need to create anything in vCenter. Even simpler ;)

  • http://viewyonder.com/ Steve Chambers

    In fact, my Cisco colleague told me about Local Area Mobility, which is a way to allow foreign hosts on a network (i.e. your DR VMs) to come up with no changes to their IP stack while the local routers (in your DR site) just add host routes for them. This isn’t a best practice or hugely scalable, but it works:

    http://www.cisco.com/en/US/products/ps9390/products_white_paper09186a00800a3ca5.shtml

  • http://viewyonder.com/ Steve Chambers

    There's quite a few missing bits, isn't there? What about if an app on the VM tries to communicate with another VM with a direct IP address – the IP addresses are different in the DR site, so that won't work.

    Also, if you have two NICs then which NIC is used to route to a completely different network (e.g. http://www.google.com)? In this case, because the IP for google.com is on a different network the IP stack needs to know which NIC to route that request out of. Normally the default gateway is used – but in this case you need to pick the correct vSwitch vNIC depending on which site you are in – so how does this work?

    It's the routing I don't quite understand – perhaps you have some screen shots of the ipconfig + route commands in the prod and dr vms?

    Thanks!
    Steve

  • http://vmetc.com rbrambley

    Didier,

    Brilliant! (I am raising my virtual Guinness to your idea)

  • http://vmetc.com rbrambley

    Steve,

    Great points. It's not a solution that eliminates all configuration changes, but to that point, what solution is? Not even VMware SRM stands up to the routing measuring stick. Maybe the only true all-encompassing solution is to configure a physical networking bridge between the DR site and the primary site so they are both on the same IP subnet. Then the IP addresses won't have to change. Can any of the physical networking config be pre-staged to account for some of these routing challenges?

    I just discovered this strategy yesterday so I don't have experience or screen shots with/of the implementation. I just threw the concept in a post to kick it around just like we are doing now! So thanks for making me think!

    Some replies to your comments:

    “What about if an app on the VM tries to communicate with another VM with a direct IP address – the IP addresses are different in the DR site, so that won't work.”

    You're right, but those apps won't work with SRM, with any other IP change automation, or even if you change the addresses manually. You could change the app to look for a DR IP too, or better yet a hostname. I know, easier said than done.

    “if you have two NICs then which NIC is used to route to a completely different network (e.g. http://www.google.com)? In this case, because the IP for google.com is on a different network the IP stack needs to know which NIC to route that request out of. Normally the default gateway is used – but in this case you need to pick the correct vSwitch vNIC depending on which site you are in – so how does this work?”

    Again, great point. I am assuming that if a gateway exists on both interfaces and there is routing available, either gateway will work. That may waste some cycles and effort as the VM figures out which way to go when it is resurrected at a new site, but it serves the main point, which is to get the VM online.

    The major benefit is that last point. If the VM is online then the administrator responsible for the app can make the changes via RDP, web interface, remote console, etc. There is still a DR workflow to understand and complete, but the virtualization admin doesn't shoulder as much of the responsibility.

  • http://vmetc.com rbrambley

    Michael,

    As long as the exact portgroup config and naming exists across ESX hosts VMotion should work. Make sure your vSwitch naming / numbering is identical too just in case – OK, maybe just because I like being consistent.

    As far as throwing a warning, I guess that prevents fully automated DRS.

    Like everything else, it's a trade-off to understand. What benefit is most important for each VM? If fully automated DRS is a deal breaker, then a different solution needs to be considered.

  • http://viewyonder.com/ Steve Chambers

    I'm assuming you have one default gateway that points all traffic for unknown networks to an IP on the known network that can route (ie. your router).

    If you have two active NICs that have different network addresses, you still normally only have one default gateway plus any static routes.

    For your solution to work you need more than this, probably multiple gateways with metrics.

    See what I mean? There is more to this than meets the eye!

    :-)

  • http://viewyonder.com/ Steve Chambers

    After speaking with someone far brighter than me @ Cisco (and there are many!), the short version of the long answer and a myriad of options is: you will have to reconfigure the default gateway in the OS when it comes up in the DR site.

  • roadfox

    It might work if you could “switch off” the switch after all is configured. So on the Primary Site the DR switch would be off, and on the DR Site the Primary switch would be off. This way the links of the corresponding interfaces would be down, and no default gateway routing would happen over the “link down” interfaces.

    Your next problem is then DNS resolution, if you are using hostnames and not direct IPs.

  • Dracolith

    This is possibly where something like Cisco DCI comes in:
    http://www.cisco.com/en/US/prod/collateral/swit

    I believe VMware validated some L2 bridging scenarios for long-distance VMotion in vSphere some time ago, under certain latency requirements.
    Perhaps long-range VMotion under those scenarios could be the all-encompassing DR solution, if you can [one way or another] solve the problem of seamlessly switching over storage to the VM's NFS mount on the DR NAS.

    Otherwise L2 bridging might be _too much_ of a connection between DR and main site.
    Better to use something like Datacenter Bridging, or an L2 bridging loop / broadcast storm could kill both primary and secondary sites.

    There's the question of what happens if you need a partial failover?

    Better have a fair amount of bandwidth between the two sites, with redundant connections, for inter-VM traffic…

  • http://twitter.com/dpironet Didier Pironet

    And for that I'm using a script that pushes out the new values for DNS, GW, and possibly WINS, search order and so on to all VMs that are up in DR.
    Here is a sample script for anyone who might be interested:
    # Server list, one "name,..." entry per line (path shown as posted; adjust to your file).
    $lines = Get-Content "c:DRServersNewDNS.in"
    $backendDNS = "xxx.xxx.xxx.xxx", "yyy.yyy.yyy.yyy"
    $frontendDNS = "zzz.zzz.zzz.zzz"
    foreach ($line in $lines) {
        # Skip blank lines so $parts[0] is never null.
        if (-not $line.Trim()) { continue }
        $parts = $line.Split(",", [StringSplitOptions]::RemoveEmptyEntries)
        $strComputer = $parts[0].TrimEnd()
        # Query every IP-enabled adapter on the remote machine via WMI.
        $colItems = Get-WmiObject -Class "Win32_NetworkAdapterConfiguration" `
            -ComputerName $strComputer | Where-Object { $_.IPEnabled }
        foreach ($objItem in $colItems) {
            # "^xyz.abc" is a placeholder for the back-end subnet prefix.
            if ($objItem.IPAddress -match "^xyz.abc") {
                $objItem.SetDNSServerSearchOrder($backendDNS)
            } else {
                $objItem.SetDNSServerSearchOrder($frontendDNS)
            }
        }
    }

  • Andy Knight

    Why don't you use DHCP on both sites with static reservations for all VMs? Then no matter which site your VM runs on, it always comes up with the right IP details for that site.

  • http://vmetc.com rbrambley

    Andy,

    Thanks! That sounds like another great alternative for simplifying the switch to a DR site.

  • vCarter

    We just finished trying this theory out. We perform 2 offsite DR exercises each year, for different divisions of the company. We have been replicating our VMs, but wanted to simplify even more by using this method. I am pleased to say that this process worked like a charm!

    Other than “working as designed”, the only issue that we found was that from the primary site, a few of our VMs registered the DR IP address in our primary DNS, which obviously caused problems. We had hoped that changing the binding order of the NICs would resolve this, but it did not. The workaround was to uncheck the default “register in DNS” option within the DR virtual NIC advanced properties.

    Our process at the secondary site was to turn off the public/production NIC before we powered on the replica, so there would be no conflicts. Just turned the replica on, registered the DR IP, and all worked great!

    Thanks again for this suggestion, it is now a part of our documented process.

  • Nestor Urquiza

    Hi,

    Wouldn’t it be simpler just to replicate VMs as they are (same IP/hostname) in DR?

    IMO DR should be a dark environment and it should only be turned on when Production is not running.

    So I am wondering if someone has experience with this: SAN to SAN replication to DR and zero reconfiguration to get DR up and running.

    Thanks,
    -Nestor
