Considerations for Implementing Fail Over VI at a Secondary Site
These are the notes I used to prepare for a discussion with a client about implementing a secondary site for DR fail over. The client has already virtualized their production data center and wants to leverage VI for DR. My main point is that VI is too often viewed as a “silver bullet” for tough projects like backup and fail over. Yes, some specific areas are easier to implement with VI, but the overall DR plan will only succeed with careful consideration and planning.
Goals and Objectives – the customer must make important decisions first!
· Recovery Time Objectives – acceptable time to start up systems and allow user access
requires server by server analysis
· Recovery Point Objectives – acceptable point in time to which data is recovered at the secondary site (how much recent data can be lost)
requires application by application analysis
· Mission Critical Services
which applications & services must be available first.
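To make the server by server and application by application analysis concrete, here is a minimal sketch of an inventory that records the objectives above and derives a start up order. The service names and hour values are hypothetical placeholders; the real numbers have to come from the customer.

```python
from dataclasses import dataclass


@dataclass
class Service:
    name: str
    rto_hours: float        # acceptable time until users have access again
    rpo_hours: float        # acceptable window of lost data
    mission_critical: bool  # must be available first


def startup_order(services):
    """Mission critical services first, then shortest RTO first."""
    return sorted(services, key=lambda s: (not s.mission_critical, s.rto_hours))


# Hypothetical inventory, for illustration only.
INVENTORY = [
    Service("file-and-print", rto_hours=24, rpo_hours=24, mission_critical=False),
    Service("email", rto_hours=4, rpo_hours=1, mission_critical=True),
    Service("active-directory", rto_hours=1, rpo_hours=4, mission_critical=True),
    Service("intranet", rto_hours=48, rpo_hours=24, mission_critical=False),
]

if __name__ == "__main__":
    for svc in startup_order(INVENTORY):
        print(f"{svc.name}: RTO {svc.rto_hours}h, RPO {svc.rpo_hours}h")
```

Sorting on the mission critical flag before RTO is only one possible policy; dependencies (for example AD and DNS before everything else) may force a different order.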
Proactive preparation – The following basic networking must already exist at the secondary site to allow fail over
· Build the physical secondary network – it’s obvious, but often not planned.
Subnets / VLANs / routing must be in place.
AD / DHCP / DNS must be in place (can be VMs or extended infrastructure of primary site)
VPN and remote access methods must be in place
Firewall / Security
Bandwidth and latency between primary and secondary sites must be examined & optimized.
This will impact or limit what you can do in terms of replication to the DR site
· Build virtual infrastructure – again, it’s obvious but often not planned.
Configure VI hosts for VM start up
Determine hardware for hosts and storage
Create resource pools, configure VMotion, DRS and HA (ensure best possible performance and availability)
Configure virtual networking
· Configure backup / restore process
Tape or disk library availability at secondary site
Secondary Management / Media server for restores during recovery as well as ongoing backups during the period the DR site is in use.
Both traditional storage agent solutions and live, full VM backup.
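The point above about bandwidth limiting what can be replicated to the DR site comes down to simple arithmetic. The sketch below estimates how long a given amount of changed data takes to cross the link between sites. The `efficiency` parameter is an assumption (the fraction of the nominal link rate that replication traffic really gets), and decimal units are used (1 GB = 8000 megabits).

```python
def replication_hours(changed_gb, link_mbps, efficiency=0.7):
    """Hours needed to push changed_gb across a link rated at link_mbps.

    efficiency is an assumed fraction of the nominal rate that is usable
    for replication (protocol overhead, other traffic on the link).
    """
    if changed_gb < 0 or link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("invalid size, link speed or efficiency")
    megabits = changed_gb * 8 * 1000          # GB -> megabits, decimal units
    seconds = megabits / (link_mbps * efficiency)
    return seconds / 3600


def fits_window(changed_gb, link_mbps, window_hours, efficiency=0.7):
    """True if the changed data can be replicated inside the window."""
    return replication_hours(changed_gb, link_mbps, efficiency) <= window_hours


if __name__ == "__main__":
    # Hypothetical example: 50 GB of daily change over a 10 Mbps link.
    print(round(replication_hours(50, 10), 1), "hours")
```

If the estimate does not fit the nightly window, the choices are more bandwidth, less data (compression, deduplication, fewer VMs) or a relaxed objective.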
Server Infrastructure replication
· Replicate encapsulated files of VMs from primary site to secondary site via (choices):
Nightly full VM backup at primary site and daily full VM restore at secondary site
Host based replication between production VMs and secondary site VMs (the “host” here is the VM’s OS, not the VI host server); the secondary VMs must be prebuilt to implement this
Real time SAN based replication (via WAN)
· Periodically run a test start up of the replicated VMs, isolated from the primary site
· Non VI system availability – physical infrastructure – what about the servers that are not virtual?
Clustering
Bare metal restore
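Each replication choice above implies a different worst case for lost data, which has to be checked against the Recovery Point Objective. The sketch below uses a deliberately simplified model (worst case loss = interval between copies + time to complete a copy). The interval and copy times are hypothetical and must be measured for a real environment, and treating real time SAN replication as zero loss is an idealization.

```python
def worst_case_data_loss_hours(interval_hours, copy_hours):
    """Simplified model: a failure just before a copy completes loses
    everything written since the previous copy started."""
    return interval_hours + copy_hours


def methods_meeting_rpo(methods, rpo_hours):
    """Return the names of the methods whose worst case loss fits the RPO."""
    return [
        name
        for name, (interval, copy) in methods.items()
        if worst_case_data_loss_hours(interval, copy) <= rpo_hours
    ]


# Hypothetical figures: (interval between copies, time to complete a copy).
METHODS = {
    "nightly full VM backup + daily restore": (24.0, 6.0),
    "host based replication (hourly schedule)": (1.0, 0.25),
    "real time SAN based replication": (0.0, 0.0),
}

if __name__ == "__main__":
    for rpo in (24, 4, 0.5):
        print(f"RPO {rpo}h:", methods_meeting_rpo(METHODS, rpo))
```

With these placeholder numbers the nightly backup and restore cycle does not even meet a 24 hour RPO, which is the kind of result that should drive the application by application discussion.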
Networking & Operations changes
Once a disaster occurs, the following networking and infrastructure changes must be handled:
· Change of IP addresses of all secondary VMs
dynamic DNS or manual process
· External networking changes
MX record (email)
DNS host records (web, intranet, applications)
VPN / Remote Access
· Internal networking and security changes
activate DHCP scope for clients
Active Directory FSMO roles, DC replication
DNS zone transfers
Host records
log on scripts – file and print
user and group permissions
· Telephony
· Point to point / branch office connectivity
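If the IP address change is handled by a manual or scripted process rather than dynamic DNS, it helps to generate the old to new address mapping ahead of time. The sketch below assumes the secondary site uses a subnet of the same size as the primary and keeps each VM's host offset. The subnets and host names are hypothetical.

```python
import ipaddress


def translate_ip(address, primary_subnet, secondary_subnet):
    """Map an address in the primary subnet to the same offset in the
    secondary subnet. Both subnets must have the same prefix length."""
    addr = ipaddress.ip_address(address)
    src = ipaddress.ip_network(primary_subnet)
    dst = ipaddress.ip_network(secondary_subnet)
    if addr not in src:
        raise ValueError(f"{address} is not in {primary_subnet}")
    if src.prefixlen != dst.prefixlen:
        raise ValueError("subnets must be the same size")
    offset = int(addr) - int(src.network_address)
    return str(dst.network_address + offset)


def dns_change_plan(hosts, primary_subnet, secondary_subnet):
    """Return {host name: (old address, new address)} for DNS host records."""
    return {
        name: (ip, translate_ip(ip, primary_subnet, secondary_subnet))
        for name, ip in hosts.items()
    }


if __name__ == "__main__":
    # Hypothetical VMs and subnets, for illustration only.
    vms = {"mail01": "10.1.20.15", "web01": "10.1.20.30"}
    for name, (old, new) in dns_change_plan(vms, "10.1.20.0/24", "10.2.20.0/24").items():
        print(f"{name}: {old} -> {new}")
```

The resulting plan can feed whatever updates the DNS host records; keeping the host offset identical also makes firewall rules at the secondary site easier to mirror.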
Users – How do you quickly get your company working again?
· Remote access for different user types (different applications / permissions)
developers
administrators
business unit technical contacts
normal workers
· VPN
remote offices
mobile users
home office
· Published applications
SSL VPN
Citrix
· VDI – a new concept for DR failover? How to provide applications and services for “stranded” users
preconfigured VM desktops for users
pool of similar desktops assigned to user groups with pre-installed applications
administrator desktops and applications
· Temporary / relocation physical access
temporary facility for users (for example, a hotel conference room)
Fail Back! – What, did you think you would just use the DR site permanently?











