Warning: Owners of HPE Solid State Drives Need This Updated Immediately
ATTENTION! Hewlett Packard Enterprise (HPE) recently announced a critical customer support bulletin regarding the expected failure of a wide range of the enterprise-class solid state drives currently being used in some of its products.
In the recently released report, HPE reveals the discovery of a firmware bug that will inevitably cause drive failures. Once an affected drive has been powered on for 32,768 hours (3 years, 270 days, 8 hours), it will fail 100% of the time unless the firmware has been updated.
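The conversion from 32,768 hours to years, days, and hours can be checked with a few lines of Python. This is only an illustrative sketch: the function name split_hours is hypothetical, and it assumes 24-hour days and 365-day years with no leap days.

```python
def split_hours(total_hours, hours_per_day=24, days_per_year=365):
    """Break a power-on hour count into (years, days, hours).

    Assumes 365-day years and ignores leap days.
    """
    days, hours = divmod(total_hours, hours_per_day)
    years, days = divmod(days, days_per_year)
    return years, days, hours


years, days, hours = split_hours(32768)
print(f"{years} years, {days} days, {hours} hours")  # 3 years, 270 days, 8 hours
```

The same function can be applied to a drive's current power-on hours to see how much of that window has already been used.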
The report, titled “HPE SAS Solid State Drives – Critical Firmware Upgrade Required for Certain HPE SAS Solid State Drive Models to Prevent Drive Failure at 32,768 Hours of Operation,” states that the issue originates in the firmware's power-on counter.
The affected firmware is used in solid state drives found in HPE Synergy, Apollo, and ProLiant servers, as well as StoreVirtual 4335 and StoreVirtual 3200 products. HPE provides a wide-ranging list of affected devices in the report.
According to the customer support bulletin, “Neglecting to update to SSD Firmware Version HPD8 will result in drive failure and data loss at 32,768 hours of operation and require restoration of data from backup in non-fault tolerance, such as RAID 0 and in fault tolerance RAID mode if more drives fail than what is supported by the fault tolerance RAID mode logical drive.”
This underscores the importance of data backup. Once an affected SSD fails, it can no longer be used for data storage. Restoring the data will require either a separate backup or, in a fault-tolerant RAID, the drives in the array that are still functioning.
The imminent failure is the result of a software issue. The power-on counter in the affected drives is stored as a 16-bit two's complement (signed) integer, whose maximum value is 32,767. Once the counter passes that maximum, at hour 32,768, it overflows and the drive fails hard.
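To illustrate why 32,768 is the critical number, the sketch below reinterprets an hour count as a 16-bit two's complement value. It is not HPE's firmware, which is not public; the helper name as_int16 is hypothetical and the code only demonstrates the arithmetic of a signed 16-bit counter wrapping around.

```python
INT16_MAX = 2**15 - 1  # 32,767: largest value a signed 16-bit integer can hold


def as_int16(value):
    """Interpret the low 16 bits of value as a signed two's complement integer."""
    return ((value + 2**15) % 2**16) - 2**15


print(as_int16(INT16_MAX))      # 32767: the last hour count that still fits
print(as_int16(INT16_MAX + 1))  # -32768: one more hour wraps to a negative number
```

A counter that suddenly reads as a large negative number is the kind of value that code written for an ever-increasing hour count may not handle, which is consistent with the failure point named in the bulletin.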
This failure can be disastrous because the affected enterprise-class drives may have been installed as part of a multiple-drive RAID (Redundant Array of Independent Disks). Drives that were installed and activated together will reach 32,768 hours at nearly the same moment, so it is highly likely that they will all fail at once. That would exceed what the array's fault tolerance can absorb and result in catastrophic data loss.