
Why Does Cloning A VM From Template Take A Long Time?

More than once over the past few years I’ve been asked to troubleshoot and explain why cloning a virtual machine (VM) from a master template takes longer than expected. Usually the virtualization admin asking is frustrated at the hypervisor. “This shouldn’t take this long. It needs to be fixed!” they say. “I definitely agree,” I say, “but let’s take a deeper look at what is happening here before we flame the vendor’s help desk technician on the phone.”

So, this post is about taking a deeper look at where the master template VM resides versus where the cloned VM is destined. My math may be a little off and may not account for every factor involved, but my point is to be close enough to demonstrate that the disk/array/LUN design can be the culprit more times than not.

When I started this post I emailed for some help, asking a few storage experts for a sanity check. I had been reasonably happy with my own answer, but I figured I’d do some research before adding the content to VM /ETC. I got back a single reply that I am paraphrasing: “Sounds about right. Let me think about it some more, and if I can stump you with anything else I’ll let you know.” He never did, so I’ll take that as a positive confirmation meaning “yes, VM moron, it is that simple.” Good enough for me! If anyone can point out any other factors I am not properly accounting for, please leave a comment.

The following is part of my email for help. It not only explains my test scenario but also illustrates the problem and the resolution. At the end of this post I make some suggestions for improving the time it takes to clone a VM.

The email for help

Oh wise and all powerful masters of the disk,

I humbly submit the following concept for your review. Guide me to a greater understanding of disk performance when cloning a VM in a VMware ESX environment.

Here’s the scenario:

  • Cloning a VM takes a long time – 10 GB VM using only 3.5 GB of space takes roughly 45 min to an hour to clone.
  • The master template and the clone reside on the same disk and NFS mount.
  • Yeah, it’s a single SATA disk in a lab. I know, it should suck.

I’m trying to explain the expected speed of reads and writes using the IOPS calculator here: http://www.wmarow.com/strcalc/

See the attached screen shot for the values I put in the calculator, but the results I’m interested in are:

  • with 50% reads and 50% writes (master and clone on the same disk), the average throughput is 1.2 MB/s
  • I used 50% reads and 50% writes for the cache as well

which means to me that

  • 3548 MB / 1.2 (MB/s) = 2957 secs or 0.82 hrs.

Here’s the screen shot of the IOPS Calculator I linked in the email for help:

[Screenshot: the wmarow disk array IOPS calculator with the values entered for this scenario]
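
If you’d rather check the arithmetic than squint at the screenshot, here’s a minimal sketch of the same calculation in Python. It simply assumes the calculator’s ~1.2 MB/s effective throughput figure as its input; it doesn’t model the disk itself.

    # Back-of-envelope clone time, assuming the calculator's throughput figure
    size_mb = 3548            # space actually used by the 10 GB template
    throughput_mbps = 1.2     # avg MB/s for 50/50 reads/writes on one SATA disk
    secs = size_mb / throughput_mbps
    print(f"{secs:.0f} secs ({secs / 3600:.2f} hrs)")   # ~2957 secs, ~0.82 hrs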

Suggestions for improvement

Obviously, the type/performance of the disks, the number of disks, and the type of array make a huge difference. I should also point out that I am using 8 ms as the value for the seek latency. I’m not focused on exact technical accuracy because my point is served without it, but changing this value makes a significant difference as well. If you want technical accuracy and more explanation about some of the numbers to use in the calculator, check out the posts written on the topic of IOPS and their impact on a virtual environment.

In my case, moving the VM template to another disk/array or increasing the number of disks used in my NFS server would help: with the cloned VM on a different disk/array the reads and writes are separated, and adding disks increases the number of possible IOPS. Yes, this post uses a single SATA disk as a simple example, but the point is hopefully clear. The same logic and math apply to shared storage scenarios, all storage protocols, any vendor’s storage device, and all RAID types. Plug those values into the IOPS calculator to get your own results.
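
For a rough idea of where per-disk numbers like these come from, here is a quick Python sketch of the classic spindle math: IOPS per disk is roughly 1 / (average seek time + average rotational latency). The 8 ms seek is the value I used in the calculator; the 7200 RPM rotational latency is an assumption, and the wmarow calculator also factors in IO size, cache hit rate, and RAID penalties, so treat this as ballpark only.

    # Rough spindle math: IOPS per disk, and how raw IOPS scale with disk count
    seek_ms = 8.0                                     # value used in the calculator
    rotational_ms = (60 / 7200 / 2) * 1000            # 7200 RPM assumption, ~4.17 ms
    per_disk_iops = 1000 / (seek_ms + rotational_ms)  # ~82 IOPS per disk

    for disks in (1, 6):
        print(f"{disks} disk(s): ~{per_disk_iops * disks:.0f} raw IOPS")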

My ultimate point is to make everyone think about how disk/array/LUN design decisions impact the behavior of the virtual infrastructure.

As an example, if my lab NFS server were using 6 SATA disks configured as a RAID 5 array, the calculation for the expected time to clone changes as follows:

  • 3548 MB / 2.99 (MB/s) = 1187 secs or 0.33 hrs.
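
For completeness, here is the same back-of-envelope sketch as before, rerun with the ~2.99 MB/s figure the calculator returns for that six-disk RAID 5 group:

    # Same arithmetic as the earlier sketch, with the six-disk RAID 5 throughput
    size_mb = 3548
    throughput_mbps = 2.99
    secs = size_mb / throughput_mbps
    print(f"{secs:.0f} secs ({secs / 3600:.2f} hrs)")   # ~1187 secs, ~0.33 hrs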

Better, right? Hey, it’s a basement lab. It’s supposed to suck!

Comments

  • matt

    I found out the hard way that doing something as simple as a 'dd' could run glacially slow if the blocksize parameter was set high enough to overrun the available memory of the service console.

    Using the Datastore browser to copy/move files around works the best. That's how I clone things. Then SSH into 'esxi' and use 'mv' and 'vi' to alter the .vmx if needed. I suspect the 'clone from template' operation is doing something really dumb.

  • http://vmetc.com rbrambley

    Matt,

    I assume the block size you are referring to is the same block size that limits the size of a .vmdk on a VMFS volume? I'm confused because even the smallest size (and default setting) of 256GB is higher than the 800 mb max of the Console, right? I'm probably missing something obvious here but how does this impact an operation like dd or a clone from template? Thanks.

  • jerelh

    Actually, he's referring to the block size option in the dd command, i.e. bs=1024. This tells the dd command to read and write 1024 bytes at a time from your input file to your output file. dd is a fabulous way to get an ISO file from a CD/DVD to a disk using dd if=/dev/cdrom of=nameoffile.iso bs=1024/2048 or whatever block size you want. Matt is basically saying that if the block size is set to any value over the amount of service console RAM, it just won't really work right. I've never cranked up the block size to over 2048, which is only a 2k block, so I'm not real sure how horrible it is. Block size is 512 bytes by default.

  • http://vmetc.com rbrambley

    Thanks! That makes a lot more sense now. Sure, so even if you have the SC
    mem size set to the 800 mb max you still do not have the block size covered
    for dd if using 1024.

  • jerelh

    Right. That block size would have to be enormous to maximize even the default SC memory. I'm not real sure why you would want to do that though. I would think a 400 mb block would be way more than you'd want to read in from a device that is dependent upon the amount of scratches/thumbprints on the surface of the media. Maybe there's a reason for it and I haven't been filled in yet…

  • matt

    If I'm copying VMDK's I'm not inclined to use 2k block size. More like 1GB. Basically the bsize has to be rather less than the free RAM or the copy comes to a screeching halt. But anyhow I pointed it out because it bit me hard. I don't know what blocksize 'vifs' and friends use. I just found it odd that I could do a manual clone and launch faster than VCenter could.

  • jerelh

    Oh, I see. So you were using dd to copy vmdk files rather than ISOs. Makes sense. I've never tried that to see how fast it was compared to cloning. Interesting that it's that much faster. I'm curious to know what the actual clone process uses for its underlying technology to copy the data.

  • Guest

    FYI: It is much quicker to use 'vmkfstools' to copy (clone) a .vmdk file than using 'cp'
