My thoughts on the reactions to the ESX 3.5 Update 2 BUG
The product expiration time bomb that was mistakenly left in the first versions of the ESX 3.5 and ESXi 3.5 Update 2 download media is no doubt an embarrassing and horrible mistake by VMware. The timing of this disaster couldn’t be worse, with Microsoft Hyper-V, Citrix XenServer, and others starting to be considered as alternative virtual infrastructure platforms by companies just beginning to explore the benefits of virtualization. How could this have happened, and what are some lessons to be learned, not just for VMware, but for VI administrators around the world?
First, this was an internal VMware blunder
At some point over the last 2 days I read that VMware publicly admitted they had a hole in their regression testing process. I can’t seem to find that comment today. I assume the KB articles have since been revised to carry newer and more important information for VMware customers. I know I read it because, not being a developer, I turned to Wikipedia to understand what regression testing is.
Regression testing is any type of software testing which seeks to uncover software regressions. Such regressions occur whenever software functionality that was previously working correctly stops working as intended. Typically regressions occur as an unintended consequence of program changes.
Common methods of regression testing include re-running previously run tests and checking whether previously fixed faults have re-emerged.
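To make that definition concrete, here is a minimal sketch in Python of what a regression check for this kind of mistake might look like. Everything in it is an assumption for illustration: `BuildConfig`, `is_usable`, and `check_release_never_expires` are hypothetical names, the date is made up, and none of it reflects VMware's actual build or test tooling.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class BuildConfig:
    """Hypothetical build settings, not VMware's real build system."""
    build_type: str              # e.g. "beta" or "release"
    expires_on: Optional[date]   # date after which the product refuses to run


def is_usable(config: BuildConfig, today: date) -> bool:
    """Return True if a product built with this config would still run today."""
    return config.expires_on is None or today <= config.expires_on


def check_release_never_expires(config: BuildConfig) -> List[str]:
    """Regression check: a release build must not carry an expiration date.

    Returns a list of problems; an empty list means the check passed.
    """
    problems = []
    if config.build_type == "release" and config.expires_on is not None:
        problems.append(
            f"release build expires on {config.expires_on.isoformat()}"
        )
    return problems


# Illustrative date only; it is not taken from any real release.
bad_release = BuildConfig(build_type="release", expires_on=date(2000, 1, 1))
good_release = BuildConfig(build_type="release", expires_on=None)

print(check_release_never_expires(bad_release))
print(check_release_never_expires(good_release))
```

The idea is simply that a check like this gets re-run against every build, so a setting that was fine for a beta cannot quietly ride along into a release.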
Obviously this process did not work correctly at VMware for Update 2, and to their credit they have been open and honest about that fact. Evidence of the continued honesty can be found in their recent communications.
From my post yesterday, “Patch for ESX 3.5 U2 BUG promised by 6:00 PM today”, in the FAQ section at the end of the email:
We are making improvements on all fronts. The product team had endeavored to deliver a release with support customers deem important. But we fell short and we are deeply sorry about all the disruption and inconveniences we have caused. We have identified where the holes are and they will be addressed to restore customers’ confidence.
From the current version of the original KB article about this issue at http://kb.vmware.com/kb/1006716:
The problem is caused by a build timeout that was mistakenly left enabled for the release build.
I am satisfied with VMware’s attitude, intentions, and actions to support customers worldwide and resolve this issue. Frankly, I am a bit surprised by the unrealistic expectations and sense of injustice expressed by some VMware partners and customers who anonymously left comments on the VMware Communities thread VMware Communities: BIG bug in ESX 3.5 Update 2 – If you’re …
Secondly, Update 2 was widely implemented because of exciting, new features in ESX 3.5 and ESXi 3.5.
Although Update 2 contained patches, the mass appeal that motivated IT departments around the globe to apply the upgrade was the new features. There has been a lot of rhetoric in numerous posts and threads with a theme similar to “how could smart administrators roll this out in production so quickly? Shame on them!”. Don’t kid yourself. Update 2’s appeal was not the patching and, furthermore, VMware has made the upgrade / patching process so simple that we all let our guard down. What other product can you think of that allows you to migrate production workloads over to other hosts so easily and lets you make these changes during business hours? I have to believe that this bug’s impact is so widespread because it has been so easy to upgrade and patch ESX in the past without consequence.
I also want to point out that the bug is not a result of current or new ESX functionality.
Look in the mirror and examine how we all helped make this happen.
The first thing we all need to realize is that although VMware ESX and ESXi are operating systems and can be considered software, when we migrate all of our production systems to VI 3.5 Enterprise, Microsoft Hyper-V, Citrix XenServer, etc., it is no longer that simple. If you haven’t already, realize now that your entire business infrastructure is in a virtual data center created by this software. The implications of this are painfully obvious today. Yes, VMware’s success has made this possible, but is it any different for any other virtualization vendor’s products?
Go ahead, be frustrated with VMware, but be ANGRY with yourself. Use that emotional energy to make sure this doesn’t happen again regardless of the virtualization platform you use. Get your internal change control process in check.
Another good opinion blog post about the reaction to this bug was written by Matthijs Haverink at Virtualfuture.info:
Sure, VMWare made a (critical) mistake, but what’s all the fuzz about ? | Virtualfuture.info










