SSD: What Happens When System Area Corruption Occurs
Scenario: System Area Corruption
Several types of corruption can occur in an SSD. This article discusses one specific type, commonly known as “system area corruption.”
We’re all familiar with operating systems (OS) and file systems. What you may not know is that the controller chip, which manages the NAND chips where data and programs are actually stored, has its own OS and file system, separate from the ones the user interacts with on the computer.
The controller’s file system contains files specific to drive functions, such as the firmware, the translator table, the defect table and others. Each time the SSD is powered on, the controller’s OS must walk through each of these files before the drive is ready and the computer can boot.
In a system area corruption scenario, one or more of the files used by the SSD’s controller become corrupt. The controller cannot complete its walk-through, so the boot process stops. The drive can no longer start up, a state commonly called “bricked.”
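The walk-through described above can be illustrated with a small simulation. This is only a minimal sketch under stated assumptions: the `SystemFile` type, the `boot` function and the use of a CRC32 checksum to detect corruption are hypothetical choices made for illustration, and real controllers use vendor-specific formats and checks that are not described here.

```python
import zlib
from dataclasses import dataclass


@dataclass
class SystemFile:
    """Hypothetical model of one file in the controller's system area."""
    name: str
    payload: bytes
    stored_crc: int  # checksum recorded when the file was last written


def make_file(name: str, payload: bytes) -> SystemFile:
    # Record a checksum alongside the payload (an assumption for this sketch).
    return SystemFile(name, payload, zlib.crc32(payload))


def boot(system_area: list[SystemFile]) -> str:
    """Walk the system files in order and stop at the first corrupt one."""
    for f in system_area:
        if zlib.crc32(f.payload) != f.stored_crc:
            return f"bricked: {f.name} failed its integrity check"
    return "ready"


system_area = [
    make_file("firmware", b"controller code"),
    make_file("translator_table", b"logical-to-physical map"),
    make_file("defect_table", b"bad block list"),
]
healthy = boot(system_area)

# Simulate corruption: one byte of the translator table changes,
# but the stored checksum still describes the old contents.
system_area[1].payload = b"logical-to-physicaX map"
corrupted = boot(system_area)
```

In this model a single damaged file is enough to stop the whole sequence, even though the user data on the NAND chips has not been touched.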
Causes
System area corruption can be caused by any of the following three issues, as well as by other, unknown triggers:
- Sudden Power Loss
Several components are involved in carrying power from a wall outlet or battery to the SSD inside a computer. If any of these components fails, the drive can lose power suddenly. SSDs that lack a supercapacitor or similar technology to hold enough power to complete their write cycles are more likely to suffer in this scenario.
- Bug
When the drive’s code encounters a condition it was not designed to understand or handle, corruption may occur in the system area.
- NAND Flash Failure
The system area used by the controller during the boot-up process may be stored on and accessed from the NAND chips, the controller chip itself or distributed across both.
If there is a significant failure of the NAND flash, the system area can be corrupted. This particular failure may also result in the loss of some of the actual user data.
What to Do to Avoid Data Loss
Hopefully, the user backed up the data on the device before the failure. Either way, there is not much more that can go wrong at this point. Short of physical sabotage or a secure erase of the SSD with the manufacturer’s toolbox software, it is difficult to make the data any less recoverable once system area corruption has occurred. The damage is done, so to speak.
What Happens at DriveSavers?
At DriveSavers, we use tools that are proprietary and co-developed with SSD manufacturers to resolve corruption issues. There are no commercial tools available to address this specific problem.
How to Avoid System Area Corruption
It will help to regularly run the toolbox software that came with the SSD device, just to see what it might find. In fact, if a device appears to be bricked, you can try running the toolbox to see if it is able to work any magic. On some SSDs, toolbox software can run even when the drive is not visible; on others, it cannot.
Make sure to keep your SSD firmware up to date, as an update may prevent a bug from causing system area corruption. As bugs are found and reported, manufacturers develop patches and fixes and distribute them through firmware updates.