DRS and Power Management under the hood of my Prius

Posted on May 8th, 2008 in drs, vmetc.com by Rich

I ended up with a Toyota Prius as my rental car for the week in San Diego. I’ve never driven a Prius before, and honestly, I’ve never really had an interest in the car until now. Like most, I knew that the Prius uses a Hybrid Synergy Drive (HSD) engine, but I had no idea about all the cool technology built into making the car so efficient. As a matter of fact, the Prius engine technology is in some ways similar to the Distributed Resource Scheduling and Power Management features of VI3 Enterprise.

According to Wikipedia’s page about the HSD:

Designing ESX Resource Pools

Posted on March 4th, 2008 in cluster, drs, esx, esx 3i, esx3.5, fail over, how to, services, vc2, vc2.5, vi3, vmetc.com by Rich

How do you design resource pools in an ESX cluster? In my experience, two strategies are the most popular. The first creates resource pools based on CPU and memory shares to manage resource contention on the hosts. The second uses reservations and limits to guarantee physical resources and ensure VM containment. This post uses a three-host ESX example to explain both strategies. Please feel free to comment on the pros and cons of each, or on why you think one is better than the other.

In the example scenario, three ESX hosts each have 16 GB of RAM and two dual-core 3.0 GHz CPUs. The three hosts will all be members of the same ESX cluster.
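
As a quick sanity check on the scenario, the raw cluster capacity can be totaled up. This is only a rough sketch: the names are made up for illustration (nothing here is a VMware API), and the figures ignore hypervisor and service console overhead, so the usable capacity will be somewhat lower.

```python
# Raw capacity of the example cluster (illustrative names, not a VMware API).
RAM_GB_PER_HOST = 16
SOCKETS_PER_HOST = 2
CORES_PER_SOCKET = 2
GHZ_PER_CORE = 3.0


def cluster_capacity(hosts=3):
    """Return (total RAM in GB, total CPU in GHz) before any overhead."""
    ram_gb = hosts * RAM_GB_PER_HOST
    cpu_ghz = hosts * SOCKETS_PER_HOST * CORES_PER_SOCKET * GHZ_PER_CORE
    return ram_gb, cpu_ghz


ram_gb, cpu_ghz = cluster_capacity()
print(f"{ram_gb} GB RAM, {cpu_ghz:.0f} GHz CPU")  # 48 GB RAM, 36 GHz CPU
```

Calling `cluster_capacity(2)` gives the same totals with one host out of service, which is worth keeping in mind when sizing reservations.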

The process vmmemctl went crazy and made the machine unusable

Posted on November 12th, 2007 in drs, esx, how to, kswapd, vmmemctl, vmotion, vmware by Rich

I got an email today about a problem I had not seen in a while. This company is still using local storage only and has not migrated to shared storage, so unfortunately they have not been able to leverage DRS yet!

Here’s a cut and paste from my customer’s original email.

The process vmmemctl went crazy today for 30 seconds or so and made the machine unusable; after that, kswapd went nuts for about 30 seconds. Then things were back to normal. What’s up with that stuff? It seems every VMware virtual machine we’ve seen these kinds of problems on. They’re pretty annoying on a development machine, and really problematic on a production machine.

 

Here’s my reply:

It’s been a while since I’ve seen that! This problem used to occur more often in the ESX 2.x days before VI3, before shared storage, VMotion, and DRS became the norm. Back then it always surfaced when an ESX host’s physical resources were overcommitted.

The reason is that your ESX server’s guest VMs are battling over RAM. Without DRS in VI3 Enterprise to move VMs elsewhere, ESX has to reclaim memory on the spot: the balloon driver (vmmemctl) inside each guest inflates and forces the guest OS to page memory out to its own swap, which is why kswapd goes nuts right afterward. If that is not enough, ESX swaps VM memory out to a swap file on the VMFS LUN. Unfortunately that process zaps the VM(s) and spikes the ESX CPUs.

Here are some quick links for more about this:

http://communities.vmware.com/thread/55488

http://communities.vmware.com/message/769479#769479

http://www.vmware.com/pdf/vi3_esx_resource_mgmt.pdf (check out page 132 for vmmemctl info)

You can try to work around this by reserving 50% of each VM’s assigned memory. For example, if your VM has 1 GB of RAM, create a memory reservation of at least 512 MB. That was done by default back in ESX 2.x, but it is no longer done with VI3. This will quickly limit how many VMs you can host, though! Maybe start with the VMs that seem to be affected the most? I would also look closely at all of your VMs and scale back virtual RAM where possible. Do all of your VMs really need all the RAM they were created with?

Of course, you can always add more ESX servers and spread the VMs out across more hosts. Finally, once you get to shared storage, DRS will manage contention for you automatically by VMotioning VMs between hosts, but you will probably still need more ESX servers!
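
To see how fast the 50% reservation workaround eats into a host, the arithmetic can be sketched like this. It is a back-of-the-envelope illustration only: the VM names, their sizes, and the 16 GB host are hypothetical, and the check ignores per-VM and hypervisor memory overhead.

```python
def reservation_mb(assigned_mb, fraction=0.5):
    """Memory reservation as a fraction of the VM's assigned RAM."""
    return int(assigned_mb * fraction)


def total_reserved_mb(vms, fraction=0.5):
    """Sum of reservations across a {name: assigned_mb} inventory."""
    return sum(reservation_mb(mb, fraction) for mb in vms.values())


# Hypothetical inventory: VM name -> assigned RAM in MB.
vms = {"web01": 1024, "db01": 4096, "dev01": 2048}
host_mb = 16 * 1024  # hypothetical host with 16 GB of physical RAM

for name, assigned in vms.items():
    print(f"{name}: reserve {reservation_mb(assigned)} MB of {assigned} MB")

reserved = total_reserved_mb(vms)
print(f"Total reserved: {reserved} MB, left on host: {host_mb - reserved} MB")
```

Every megabyte reserved is a megabyte the host cannot hand to another VM, which is why trimming oversized VMs first pays off.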