Troubleshooting ESX logs
Another session I attended at VMware's Partner Exchange last week was titled ESX Log Analysis – Tech 207. I did not realize it when I signed up, but this was essentially the same session I attended at VMworld 2007 last September. A quick Google search for the VMworld slides showed that Scott Lowe live-blogged this very session from San Francisco. Searching the VMworld.com site, I also found that it was presented at VMworld Europe 2008 under the title VI3 Advanced Log Analysis. You can get a copy of the .ppt used at the VMworld Europe 2008 session on my Files Page.
There is nothing really new here about troubleshooting ESX logs, but the following are my notes from last week. They do include some general notes on ESXi logs and on using Update Manager.
I cleaned up my notes a little, but what follows is still a raw outline. Use these notes and the .ppt mentioned above to help educate yourself on this topic.
Use the VI Client connected to VC to get ESX logs (a GUI for the vm-support dump script)
- Log in to VC with the VI Client as administrator
- File > Export > Diagnostic Data
- Select the servers from which to collect logs
- Select the check box to include info from VC
- Specify a location for storing the files (gzip format)
Dumps include the following ESX 3/3.5 logs, mostly found in /var/log. Paths are relative to this directory.
- vmkernel
- messages
- dmesg
- boot.log
- initrdlogs/*
- vmksummary
- vmware/hostd.log
- vmware/vpx/vpxa.log
- vmware/esxcfg-boot.log
- vmware/esxcfg-firewall.log
- vmware/esxcfg-cim.log
- vmware/esxcfg-linuxnet.log
- vmware/esxupdate.log
- oldconf/esx.conf*
- rpmpkgs
- vmkernel-version
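As a quick sanity check on an exported dump, the list above can be turned into a small script. This is only a sketch: it assumes the dump is a gzip-compressed tar archive and that member paths inside it end with var/log/ followed by the relative path. Neither assumption comes from the session, and the function names are made up for illustration.

```python
import tarfile

# A few of the relative paths from the list above (all under /var/log).
EXPECTED_LOGS = [
    "vmkernel", "messages", "vmksummary",
    "vmware/hostd.log", "vmware/vpx/vpxa.log", "vmware/esxupdate.log",
]

def missing_logs(member_names, expected=EXPECTED_LOGS):
    """Return the expected logs that no archive member path ends with."""
    missing = []
    for rel in expected:
        suffix = "var/log/" + rel  # assumed layout inside the archive
        if not any(name.endswith(suffix) for name in member_names):
            missing.append(rel)
    return missing

def missing_logs_in_bundle(bundle_path):
    """Open a gzip-compressed tar bundle and report expected logs it lacks."""
    with tarfile.open(bundle_path, "r:gz") as bundle:
        return missing_logs(bundle.getnames())
```

Call missing_logs_in_bundle() with the path to the exported file. An empty list means every log named in EXPECTED_LOGS was found.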
The esxupdate script is based on yum, but it uses only the yum infrastructure. Update Manager (UM) has replaced esxupdate as the tool you use directly, and esxupdate is now the foundation for UM. If a patch includes an update for esxupdate, esxupdate updates itself. The esxupdate log tells you what was installed via esxupdate.
ESX 3i log files
These are not stored on the flash memory / drive
- config.log
- messages
- slpd.log
- vmware/hostd.log
- vmware/aam/* (HA logs)
- vmware/vpx/vpxa.log (VC management agent)
Restart hostd (management services) from the 3i GUI or via the service console; it has a tendency to get stuck.
For ESX 3i the log rotation is different: the current log is the one without an extension, and the next is “.0.gz”
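When reading a directory full of rotated logs, it helps to sort them from current to oldest. Here is a minimal sketch of that ordering rule. It accepts both the ESX 3i “.N.gz” names and the plain “.N” extensions used by the ESX 3.x vmkernel log, and it assumes that higher numbers always mean older files (my notes only cover the first rotated file). The function names are my own.

```python
import re

def rotation_index(name, base):
    """Return -1 for the current log, N for base.N or base.N.gz, else None."""
    if name == base:
        return -1  # the current log has no extension
    match = re.fullmatch(re.escape(base) + r"\.(\d+)(?:\.gz)?", name)
    return int(match.group(1)) if match else None

def newest_first(filenames, base):
    """Order the log files belonging to `base` from current to oldest."""
    indexed = [(rotation_index(name, base), name) for name in filenames]
    # Drop unrelated files, then sort by rotation number (assumed: higher = older).
    return [name for _, name in sorted(p for p in indexed if p[0] is not None)]
```

Comparing the numbers as integers rather than as strings keeps a “.10” file after “.9” instead of after “.1”.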
VMkernel Log (3.0.x/3.5)
Located in /var/log/
Contains all events generated by the vmkernel
The vmkwarning log is a subset containing warning events only
Rotated with numeric extensions: the current log has no extension, and the next newest has a “.1” extension
All events since the last vmkernel load are also in memory in /proc/vmware/log
This log can have entries from sources besides the vmkernel, such as processes running in the vmkernel
The numbers in this log are not error codes; they are the CPU:World ID, driver instances, or line numbers in VMware code. When troubleshooting, take the CPU:World ID and find the names file under /proc/vmware to figure out which VM is having errors, then go to the VM's folder on the VMFS volume (where the .vmx and .vmdk files are found) and examine the log file there for errors.
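To pull the CPU:World ID out of a large vmkernel log without reading every line by eye, something like the following sketch can help. It assumes each entry carries a field shaped like cpu1:1034) immediately before the message, and that warning entries start the message with WARNING. Check both assumptions against your own logs. The function names are my own.

```python
import re

# Assumed field layout: "cpu<N>:<world id>)<message>"
_CPU_WORLD = re.compile(r"cpu(?P<cpu>\d+):(?P<world>\d+)\)(?P<message>.*)")

def parse_cpu_world(line):
    """Return (cpu, world_id, message), or None if the field is absent."""
    match = _CPU_WORLD.search(line)
    if match is None:
        return None
    return int(match.group("cpu")), int(match.group("world")), match.group("message")

def worlds_with_warnings(lines):
    """Collect the World IDs of entries whose message starts with WARNING."""
    worlds = set()
    for line in lines:
        parsed = parse_cpu_world(line)
        if parsed and parsed[2].startswith("WARNING"):
            worlds.add(parsed[1])
    return worlds
```

The World IDs this returns are the ones to look up under /proc/vmware, as described above, to find the VM behind each warning.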