Testing
Testing... We have heard of it.
Oh, I forgot: it's all 'cos 'we aren't an OS company'
Well, if you are not an OS company why are you sticking your grubby incompetent fingers anywhere near a driver image, may I ask?
Here's a fantastic fail: HPE's July ServicePack for ProLiant servers bricked some network adapters so badly they “must be replaced.” An advisory issued late last week explained that the mess is triggered “when a driver upgrade is performed with … HPE QLogic NX2 1/10/20 GbE Multifunction Drivers for VMware vSphere 5.5, 6.0, and …
"Well, if you are not an OS company why are you sticking your grubby incompetent fingers anywhere near a driver image, may I ask?"
Since when is it down to the likes of MS, VMware and Red Hat as "OS companies" to develop drivers for everyone else's hardware?
So to answer your question - probably the same thing that Intel, AMD, Nvidia, Brocade, Broadcom, QLogic, Cisco, Emulex, Dell, Realtek et al are doing with their fingers near a driver image for their hardware.
Also...yet again...wrong HP!
But what about all that R&D? You don't think the firmware updates and HP logos on third party network cards appear there all by themselves?
Of course not... It takes months or years of staring at one's navel before the correct font and colour can be selected, which is why the third-party fixes take so long to appear on Dell/HP/Lenovo server kit.
...is that if you were wanting to DDoS a company that was using this equipment, your life got a whole lot simpler.
Subtext: It should not be possible to be able to do this. (In the old days I seem to remember that you would have a jumper on the PCB to prevent such mischief).
In the old days I seem to remember that you would have a jumper on the PCB to prevent such mischief
In the old days SysAdmins usually knew what a jumper was. Many could even follow instructions like "pull up lines 5 and 36 with a 1k5 while applying power" and might even own soldering irons.
In the "old days", firmware was much smaller, simpler and less prone to requiring patching. Most of the "brains" was in silicon, so there wasn't the need to drop firmware as much. These days custom silicon is expensive and coding firmware is cheap, so bugs creep out and updates are required.
Add in scaling issues - if all you had was a single large Unix server, flipping the jumper is relatively trivial. With 1000+ servers in VMware farms/private clouds, flipping all the jumpers becomes time consuming.
To be fair, there probably are jumpers, they're just set to allow updates for the reasons above.
Earlier this year they bricked a whole bunch of laptops with an update, they already bricked Proliants in 2014 with an update, marvellous QC there chaps...
Once again El Reg likes to post incomplete information.
For all you people blaming lack of testing, the combination that bricks the NIC is when you use a brand new driver with firmware that is like 2+ years old.
If you follow the DOCUMENTED Recipe for Drivers and Firmware, you'd be fine.
The image and SPP were pulled to prevent customers who don't RTDM from hurting themselves.
"...Once again El Reg likes to post incomplete information.
For all you people blaming lack of testing, the combination that bricks the NIC is when you use a brand new driver with firmware that is like 2+ years old.
If you follow the DOCUMENTED Recipe for Drivers and Firmware, you'd be fine.
The image and SPP were pulled to prevent customers who don't RTDM from hurting themselves."
It doesn't matter. This should NOT be possible.
If that particular combination is known to cause a problem then why doesn't the update stop and warn you before quitting?
The simple fact that it's a known, documented issue just shows sloppy programming.
This.
But more precisely - why does a driver let you update it except against known-good firmware?
Quite literally "Sorry, you have to update the firmware to continue, to at least X.X.X which has been tested with this driver".
If the ***only*** official way to do it is to update the firmware and then the driver, the driver should be checking that the firmware is up-to-date and refusing to continue.
And I'll tell you the answer - because they will break as many systems that way as any other. People will be stuck on old firmware/drivers because of a bug in one or the other that they know hits them elsewhere, so they don't upgrade at all, rather than risk having to do both.
But, honestly, with this kind of kit - you literally say "Not a supported configuration" in your update tool, and then offer the path to get a supported configuration (i.e. update the firmware first, then the driver). At this level, if it's not been tested, it shouldn't be possible.
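The pre-flight gate described above could be sketched roughly like this; the version numbers, function names, and the "tested or newer" policy are all hypothetical, not anything HPE's tooling actually does:

```python
# Hypothetical pre-flight check an update tool could run before installing a
# new driver: refuse to proceed unless the currently installed NIC firmware
# is one the driver has actually been tested against (or newer).

TESTED_FIRMWARE = {"7.14.0", "7.15.2"}  # hypothetical known-good versions
MIN_FIRMWARE = (7, 14, 0)               # hypothetical minimum tested version

def parse_version(v: str) -> tuple[int, ...]:
    """Turn a dotted version string into a comparable tuple, e.g. "7.14.0" -> (7, 14, 0)."""
    return tuple(int(part) for part in v.split("."))

def safe_to_install_driver(installed_fw: str) -> bool:
    """Allow the driver update only against tested firmware, or anything newer."""
    if installed_fw in TESTED_FIRMWARE:
        return True
    return parse_version(installed_fw) >= MIN_FIRMWARE

# The "Not a supported configuration" path: old firmware, new driver -> refuse.
if not safe_to_install_driver("7.2.14"):
    print("Refusing: update NIC firmware to at least 7.14.0 first, then retry the driver.")
```

The point of gating on "tested or newer" rather than an exact match is that it fails safe: the 2+ year old firmware combination that bricks the NIC is rejected with a clear remediation path, instead of silently flashing ahead.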
"The simple fact that it's a known, documented issue just shows sloppy programming."
If I had a penny for every nasty bug that got classified as WONTFIX... Would be enough to buy a pint or two.
What is particularly nasty is dropping already sold configurations from the 'supported' list (and subsequently from internal 'to be tested against' lists if one is to believe that such beasts exist at all).
Heh, one case was quite egregious. Support for one particular FC adapter/switch combination was dropped after 2-3 years in the field, without explanation of reasons. Documentation was changed without keeping a proper list of changes. Support drones telling you with a straight face "no, it has never been supported".
Except...there happens to exist a printout of SAN support matrix from the time of purchase. Because it's only too sensible to assemble your Fibre Channel setup from supported components and keep a bit of local documentation about it.
The point I was making here was not so much that updating to an incompatible overall configuration should not be possible, but that anyone can rewrite the firmware to do whatever they want. OK, a rogue techie could do that if they had access to the NIC jumpers (in the olden days), but a typical corporate with concerns about security should really have tamper seals and/or locks on system units.
Sysadmins concerned about rogue NICs would have to be able to perform MD5 hashes on NIC firmware for all PCs, and have a utility to lock out NICs with unauthorised MD5 hashes. Just reprogramming the NIC to, for instance, fool around with ARP would be pretty devastating, particularly if the lockout method relied on ARP to do its job.
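A minimal sketch of the kind of allow-list check described above, with a hypothetical known-good hash (MD5 as suggested, though SHA-256 would be the modern choice given MD5's known collision weaknesses):

```python
# Hypothetical firmware integrity check: hash a dumped NIC firmware image and
# compare it against a locally kept allow-list of authorised hashes.
import hashlib

# Hypothetical allow-list; in practice this would hold the hashes of the
# firmware images the sysadmin has actually vetted and deployed.
AUTHORISED_HASHES = {
    "9e107d9d372bb6826bd81d3542a419d6",
}

def firmware_digest(image: bytes) -> str:
    """MD5 hex digest of a firmware image dump."""
    return hashlib.md5(image).hexdigest()

def nic_is_authorised(image: bytes) -> bool:
    """True only if the dumped image matches an authorised hash exactly."""
    return firmware_digest(image) in AUTHORISED_HASHES
```

Note the check is only as good as its delivery: as the comment points out, if the lockout mechanism itself travels over a protocol (like ARP) the rogue firmware can tamper with, the rogue NIC can simply lie its way past it.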