Deploying VMware in a Linux Shop #PO2575

This session was my last VMworld 2008 session on Tuesday 9.17. I must have missed it in my notebook Tuesday night, so I am posting my notes now. The session was hosted by Mike DiPetrillo, Principal Systems Engineer at VMware. Mike did the entire session on one leg. If you saw one of Mike’s sessions or you know Mike you’ll understand that comment. :)

This session was designed for companies that are primarily Linux shops and have numerous virtual machines (VMs) on VMware virtual infrastructure. Mike provided general information about Linux as a guest OS as well as some best practices and performance tips for both the VMs and the ESX hosts. The rest of this post is my notes from the session.

Mike started out by talking about general recommendations for building Linux VMs.

  • When building Red Hat Enterprise Linux (RHEL) VMs, use a minimum of 512 MB of RAM for best performance. For RHEL 3 and RHEL 4, use a minimum of 256 MB of RAM.

    • Over-allocating memory for VMs can cause performance issues

  • Always use the LSI Logic SCSI adapter with Linux VMs as many Linux distros will have issues with the Buslogic adapter

  • Always install VMware Tools for best performance. Mike gave us a breakdown of the services and drivers that are installed:

    • vmware-guestd – the VMware tools service

    • vmware-user – cut and paste feature

    • vmblock – mounts the file system for the drag and drop feature

    • vmhgfs – shared folders service

    • vmmemctl – the memory ballooning driver that assists ESX memory management

    • vmxnet – the network driver

    • vmsync – the service that freezes and then thaws the file system for snapshots

VMware Tools are now available as open source, and many Linux distros will include them as standard packages in the base install in the future. I learned in an earlier session that one of VMware’s objectives for the future is to work with the various distros to provide Tools updates via the package managers as well. Being able to use yum, apt-get, or even Synaptic would be a huge help in my opinion, and I look forward to when this happens.

Mike then introduced a scripted method of automating the process of updating VMware Tools in the meantime. At a high level, the script is scheduled via cron to check for the presence of an updated kernel, and if one is found, a second script invokes the installation routine to recompile the Tools against the new kernel. Mike said he would provide the script on his blog for download. Go to http://www.mikedipetrillo.com for more information. I posted about a similar scripted process not too long ago as well.
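
The kernel check described above can be sketched roughly as follows. This is not Mike's script; it is a minimal Python illustration that assumes a hypothetical state file (`/var/lib/vmtools-kernel`) for remembering the last kernel, and assumes the Tools are rebuilt by running `vmware-config-tools.pl` with its default-answers switch. Verify that command against your installed Tools version before scheduling anything like this.

```python
import platform
import subprocess
from pathlib import Path

# Hypothetical location for remembering which kernel Tools was last built against.
STATE_FILE = Path("/var/lib/vmtools-kernel")

# Assumed reconfigure command; verify it against your installed VMware Tools.
RECONFIGURE_CMD = ["vmware-config-tools.pl", "-d"]


def needs_reconfigure(running_kernel, recorded_kernel):
    """Return True when Tools has not been configured for the running kernel."""
    return recorded_kernel is None or running_kernel != recorded_kernel


def read_recorded_kernel(state_file):
    """Return the kernel release stored in state_file, or None if absent."""
    if not state_file.exists():
        return None
    return state_file.read_text().strip() or None


def check_and_reconfigure(state_file=STATE_FILE, run=subprocess.check_call,
                          running_kernel=None):
    """Reconfigure Tools if the kernel changed; return True if the command ran."""
    running = running_kernel or platform.release()
    if not needs_reconfigure(running, read_recorded_kernel(state_file)):
        return False
    run(RECONFIGURE_CMD)
    # Only record the kernel after the reconfigure command succeeded.
    state_file.write_text(running + "\n")
    return True


if __name__ == "__main__":
    check_and_reconfigure()
```

Run from cron as root, the sketch does nothing on most invocations and only calls the reconfigure command the first time it sees a kernel release that differs from the recorded one.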

Mike then discussed the timekeeping issues of Linux VMs. My memory is a little fuzzy on Mike’s explanations of both slow and fast time sync issues with Linux VMs, but Mike pointed to 2 VMware knowledge base articles for some known fixes available today.

For additional performance gains in Linux VMs, Mike recommended removing certain unnecessary services and disabling some features:

  • Smartd

  • cpuspeed

  • disable screensavers

  • get rid of hardware management agents from P2V conversions

  • get rid of NIC teaming agents and software

Mike said that there was a whitepaper for optimizing Linux VMs that would soon be published.

Mike then began talking about automating deployment and administration of Linux VMs. He discussed a PXE boot method of deployment as well as some general ideas about guest customization. Once again the scripts and the documentation for these processes are available on his mikedipetrillo.com blog. Mike used a recorded PXE installation as a demo.

Finally, Mike provided some best practices and recommendations for optimizing ESX performance in a Linux shop.

  • Use SMP sparingly. There is usually little need to create VMs with multiple virtual processors. Even adding 2 CPUs to multiple VMs could create scheduling problems and cause a condition where VMs are ready to execute CPU commands but have to wait for time on the ESX host's physical CPUs.

  • Do not pin CPUs to VMs.

  • Do not exceed 768 MB of RAM for Linux guests. I personally know of several customers that do this, and I am not clear as to what Mike’s recommendation prevents. Bottom line is that using less RAM in any VM will increase the number of VMs that can coexist on an ESX host and also keep the ESX server as efficient as possible.

  • Reduce Linux VM swapping however possible
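
To make the SMP point concrete, here is a small sketch that totals the vCPUs configured on a host and compares them with its physical cores. The VM names and core count are made-up illustration values, and the ratio is only a rough indicator of potential scheduling contention, not a VMware-defined threshold.

```python
def vcpu_to_core_ratio(vcpus_per_vm, physical_cores):
    """Return total configured vCPUs divided by physical cores on the host."""
    if physical_cores <= 0:
        raise ValueError("physical_cores must be positive")
    return sum(vcpus_per_vm.values()) / physical_cores


# Hypothetical host with 4 physical cores.
all_smp = {"web01": 2, "web02": 2, "db01": 2, "app01": 2}
mostly_single = {"web01": 1, "web02": 1, "db01": 2, "app01": 1}

print(vcpu_to_core_ratio(all_smp, 4))        # 2.0
print(vcpu_to_core_ratio(mostly_single, 4))  # 1.25
```

The same four VMs ask for far fewer cores at once when only the one that needs SMP keeps its second vCPU.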


  • graeme
    These two statements seem contrary:
    - "Do not exceed 768 MB of RAM for Linux guests"
    - "Reduce Linux VM swapping however possible"

    What happens when my workload uses more than 768 megs of RAM? I will swap. If your workload needs more memory, your host has capacity, and you aren't at a VMware architectural limit, add it.

    I can't wait till hotswap RAM arrives.
  • graeme,

    Yes, I agree the statements are contrary for an individual VM's OS. I think the statement is aimed at the sum of all the VMs in the virtual environment. In other words, keep the bulk of your Linux VMs below 768 MB of RAM with minimal swap files. Only create VMs with larger memory and swap for applications that truly utilize the extra RAM. Remember that each VM also has an ESX swap file in its volume folder too. Assigning excessive amounts of RAM to VMs will increase the overhead ESX needs to do its job as well.

    Too often administrators are used to the physical server mentality of more ram makes a server better. With VI less is more not just at the ESX host level, but at the VM level too.
  • Chris
    OK, so I'm back to being confused. For starters we're a pretty big unix shop. The only reason I use windows at all is to run the VI client, in fact.

    We have several applications running on Linux, most notably Oracle, tomcat, and apache.

    All three of these are multithreaded and benefit from multiple cpus, so why would making the vms have 2 vcpus be so bad?

    Also, in the case of Oracle, more memory is certainly better, why would I reduce it to the bare minimum? If I assign too much memory, it's not like it's being locked up just for that machine.

    I'm genuinely confused.
  • Chris,
    there is an interesting but a little bit outdated (it's from 2006) paper from IBM about websphere performance on ESX: http://www-900.ibm.com/cn/crl/download/ESX_WAS_...
    In this paper there is a comparison of running multiple single vcpu vms vs. running smp vms with the same total number of cpus. The result is that single cpu vms perform about 20% better than vms with multiple cpus.
  • Chris,

    Running 2 vCPUS on your VMs is not a bad thing by itself, but isolating your 3 VMs on the same host could be. For example, if your 3 SMP VMs are all hosted on a host with only 2 physical processors you could create a cpu ready time issue as your VMs are ready to execute but can't get scheduled time to the physical CPUs. That's a small, simple example.

    Check out the VMware Ready Time Observations white paper for more information. http://www.vmware.com/resources/techresources/641

    "To achieve best performance in a consolidated environment, you must consider ready time — the time a virtual machine must wait in a ready-to-run state before it can be scheduled on a CPU. This paper provides information to help you understand the factors that influence ready time on an ESX Server 3.0 system."
  • J
    I agree that using single vCPU VMs is more efficient, but in a Linux VM, that means that you must load the normal kernel, not the SMP one, doesn't it?

    I mean, the VM configuration (one vCPU) and the configuration of the OS inside the VM MUST match, don't they?

    Thanks for the answer,
    J
  • J,

    Great point. Yes, they must match. This is most frequently discussed with Windows VMs since the majority of P2V conversions are automated for Windows servers, which result in SMP VMs. The HAL must be switched to use a uniprocessor driver. The same is true for Linux VMs, but because there are so few P2V tools that can do a Linux P2V, a net new VM is often built and the apps and data are transferred.

    Here's a post on the process for Windows VMs I did earlier this year:
    http://vmetc.com/2008/06/11/how-to-p2v-multi-pr...
  • I'm not a Linux expert, but I believe the recommendation concerning RAM usage is about how Linux uses free memory as file system cache. So the amount of RAM shown as in use in VMtools will include the amount in use as cache. I think they are hinting that you don't want to waste server RAM real estate on file system caching. But I agree with the statement that if your workload requires it you should configure more to avoid paging.
  • I found it surprising that Mike didn't provide the Linux timekeeping KB link that's actually been useful in my experience:

    http://kb.vmware.com/kb/1006427
  • Just stumbling across the nice set of notes from my presentation at VMworld. It seems there's some confusion on a couple of points so let me clarify.

    1) If you need more than 768 MB of RAM then go ahead and allocate it. Jason Willey's comment hit the nail on the head. What I meant to say was there are a lot of Linux VMs out there doing simple tasks that were given 1 GB of memory as a default install and they really don't need it. Once you break the 768 MB threshold in most Linux distros then you end up allocating another chunk of memory for file system cache. Staying under 768 MB if your VM doesn't need it prevents the extra memory usage for file system cache which means you can get more VMs on a host (generally memory is the limiting factor to VM density).

    2) Nearly all apps out there will scale better when you stack multiple single CPU VMs on a host versus fewer multi CPU VMs - even apps like databases that like multiple CPUs. The reason is writing a true, preemptive multi-processing, multi-threaded application that behaves properly with the underlying operating system's scheduling is very difficult. Whenever you add more CPUs to the mix from a scheduling standpoint whether it be physical or virtual you add overhead since you must keep the multiple CPUs in sync. Scaling out with multiple single CPU VMs allows for more flexible VM scheduling and also removes the overhead in a multi processor Guest OS and App stack. Now, if you want to scale out with multi CPU VMs that's fine. I was simply making the point that VMs allow you to think differently and scale differently. Of course it's a thought process that's uncomfortable for most.

    3) I'll be posting the PXE booting of ESX soon.

    4) The correct URL for my blog is http://www.mikedipetrillo.com.

    5) Andy Leonard posted a great URL that I had forgotten to hand out before for time sync in guests. Make sure to check it out in the comments above.

    I hope that clears some things up.
  • Mike,

    Thanks for the clarifications, and I changed your URL in the body of the post. Did it recently change?

    Most importantly, thanks for the great VMworld presentation!
  • J
    Mike, I agree with you in the 2nd issue, "scaling out with multiple single CPU VMs allows for more flexible VM scheduling and also removes the overhead in a multi processor Guest OS and App stack".
    For example, if we have a host with two dual core processors, one thing I observe (both in Windows and Linux) is that if I put a single vCPU in a VM the VM's performance increases. That's OK. In the performance tab in VIC, we can see the VM reaching nearly 100% of CPU utilization when we stress/test it. And it takes less time to do CPU-intensive things like compilations, etc.

    But at the same time if we look at the performance tab of the HOST we can see that for 100% of VM CPU utilization the host's graph only reaches 25%!!!
    That is because the VM is stressing only one core (2 processors * 2 cores = 4. So 1/4 = 25%) ?????

    Can we achieve more utilization?

    PS1: I've checked the affinity of the VM and it can run on all the cores.

    PS2: It is true also that with 1 vCPU, the %RDY parameter (esxtop) decreases, which is good. What is a good value for %RDY? Less than 5%?

    Thanks for your time!
    J
  • Good questions, J. vCPUs (virtual CPUs) get directly mapped and scheduled onto lCPUS (logical CPUs - aka cores or hyperthreads). Since you only have 1 vCPU in the guest you're only going to use 1 lCPU or core on the host. We don't split threads - no x86 virtualization solution does. If you had more guests then host utilization would increase. Of course if that same VM had 2 vCPUs then you would see 50% utilization on the host since you'd be using 2 lCPUs at the time. Actually, you'd see a little more than just 25% or 50% both times because you have the Service Console of ESX which gets scheduled on one of the cores (unless you're using ESXi), the vmkernel (the traffic cop) gets scheduled on one of the CPUs, and the invisible helper world (the I/O path) for that VM gets scheduled on a CPU. So in theory if you only had 1 VM running on the host and it had 1 vCPU then it would use 1 core, the vmkernel would use 1 core, the Service Console would use 1 core, and the helper world would use 1 core. All 4 cores would be getting exercised with just 1 VM running with just 1 vCPU. A lot of people don't realize this. This brings us to %RDY.

    %RDY (READY spelled out) means the percentage of time that a VM is ready to run but can't run because it can't get onto a CPU. Basically the VM was next up in line to get scheduled but all of the lCPUs were busy. The higher the %RDY the more contention you have on the host. There are 2 things you could do to reduce %RDY:

    1) Reduce the number of vCPUs in your guests. I see the case all the time where people create 2 vCPU guests by default whether the app needs it or not. This is because they had 2 CPU physical hosts so they're just following what they did in the physical world. The problem is 2 vCPU guests take up twice as many lCPUs when scheduled. The more lCPUs in use with more VMs means someone has to wait and %RDY goes up. This is just fine most of the time but I generally say if you see %RDY higher than 50% then you should look at your VMs and determine if they really need that other CPU. If you decide that you can knock down the vCPU count in a VM make sure to change the HAL in Windows or kernel in Linux to match the single CPU VM. Going from multi-proc HAL/kernel to single-proc HAL/kernel is easier (and supported) in the Linux world than the Windows world.

    One clue that the VM doesn't really need the second vCPU is to look at %WAIT. If %WAIT is higher than 30% there's a good chance the second CPU is just sitting idle when the VM gets scheduled.

    2) The other thing you can do is buy another physical host to increase the core count in your cluster.

    NOTE: The recommended percentages I'm giving you for %RDY and %WAIT are my personal recommendations just based on seeing this stuff for 6 1/2 years at various customers. You may feel comfortable with lower limits. I know of one customer that won't go over 30% for %RDY or 10% for %WAIT. I'm just telling you what to look for. With some practice tuning in your environment with your workloads you'll be able to find out the exact threshold you need.

    I hope this helps.
  • Interesting recommendation for %ready times. I've always heard / read that double digit ready time was bad (so >10%). As always, it's relative to the application on each VM and the user experience impact. More on this topic at http://communities.vmware.com/docs/DOC-7390.
  • Rich,

    It does all come down to comfort level and number of VMs. Obviously if you have 40 VMs on a host and 50% of the time they're getting blocked from running you have a problem. Of course if those 40 VMs really weren't doing anything all the time (maybe you have 40 idle print servers or DHCP servers or something) then 50% ready really isn't a big deal. I too used to say anything over 5% ready is bad. I've slowly increased it over time to the 50% threshold. Again, these are my recommendations - not VMware's. You'll have to find something that you're comfortable with and, as always, LOOK and UNDERSTAND your applications. I've seen too many customers that have no clue what their app is doing or is supposed to be doing and then they want to start troubleshooting at the virtualization layer. Bad idea.
  • J
    Thanks Mike and Rich, very, very interesting.
    It was exactly what I was supposing. I'm seeing 20% RDY (average) in my hosts (sometimes 45%...). Most of the VMs are configured with 2 vCPUs and are VMs with little load.
    The strange thing is that I'm seeing 400 or 500 or 600 as values for the %WAIT parameter!!
    Is it a bug? Should I read these as 40, 50, or 60% values? Maybe I have to read the %TWAIT or %BWAIT values instead of the %WAIT value?
    Or maybe it's correct.
    This seems to happen both in ESX 3.0.2 and ESX 3.5.

    TIA,
    J
  • J, if you want you can email me off-line to figure out what's going on: http://www.mikedipetrillo.com/about.html. Where are you getting the values from? Virtual Center? esxtop?