ESXi/ESX 3.5 Update 3 iSCSI and FC Alert – Queue for device has been blocked
A few virtualization bloggers have reported receiving an alert email from VMware about an I/O failure issue involving iSCSI or Fibre Channel (FC) SANs. There is also an alert currently displayed at http://www.vmware.com/support. In summary, the I/O queue between ESXi/ESX 3.5 Update 3 hosts and VMFS 3 LUNs can become blocked indefinitely, which results in all paths to the storage entering a standby state. The issue is apparently isolated to Update 3.
Eric Sloof is one blogger who received the email, and he has published his copy on his NTPro.nl blog. Here’s a brief quote from the email about the issue:
PROBLEM STATEMENT AND SYMPTOMS:
- ESX or ESXi Host may get disconnected from Virtual Center
- All paths to the LUNs are in standby state
- Esxcfg-rescan might take a long time to complete or never complete (hung)
- VMKernel logs show entries similar to the following:
- Queue for device vml.02001600006006016086741d00c6a0bc934902dd115241 49442035 has been blocked for 6399 seconds.
- Please refer to KB 1008130.
A reboot is required to clear this condition.
VMware is working on a patch to address this issue. The knowledge base article for this issue will be updated after the patch is available.
VMware KB 1008130, titled "VMware ESX and ESXi 3.5 U3 I/O failure on SAN LUN(s) and LUN queue is blocked indefinitely", provides the pattern of vmkernel messages that identifies this issue:
Error messages matching this pattern are repeated continually in vmkernel:
vmkernel: cpu6:1177)SCSI: 675: Queue for device vml. has been blocked for 7 seconds.
vmkernel: cpu7:1184)SCSI: 675: Queue for device vml. has been blocked for 6399 seconds.
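If you want to check a host's log for this pattern rather than eyeballing it, the match can be sketched in a few lines of Python. This is only an illustrative sketch, not a VMware tool: the function name find_blocked_queues is made up for this example, it assumes Python 3 and that you run it on a workstation against a copy of the vmkernel log, and the device ID is matched loosely because it appears elided (or with a space) in the quoted messages.

```python
import re

# Message text as quoted in the email and KB article. The device ID is
# captured loosely since quoted logs show it elided or containing a space.
BLOCKED_RE = re.compile(
    r"Queue for device (?P<device>.+?) has been blocked for (?P<seconds>\d+) seconds"
)


def find_blocked_queues(lines):
    """Return {device: longest reported blocked time in seconds}."""
    worst = {}
    for line in lines:
        match = BLOCKED_RE.search(line)
        if match:
            device = match.group("device")
            seconds = int(match.group("seconds"))
            worst[device] = max(seconds, worst.get(device, 0))
    return worst


# Sample input: the two lines quoted from the KB article above.
sample = [
    "vmkernel: cpu6:1177)SCSI: 675: Queue for device vml. has been blocked for 7 seconds.",
    "vmkernel: cpu7:1184)SCSI: 675: Queue for device vml. has been blocked for 6399 seconds.",
]
for device, seconds in find_blocked_queues(sample).items():
    print(f"{device}: blocked for up to {seconds} seconds")
```

To scan a real log, pass an open file object instead of the sample list, for example find_blocked_queues(open(path, errors="replace")), where path is wherever you copied the log. Any device that shows up with a steadily growing number of seconds matches the symptom described in the KB article.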
As stated in both the email and the KB article, the only way to clear the condition for now is to reboot the affected ESXi/ESX 3.5 Update 3 host. A permanent fix will have to wait until VMware provides a patch.