Solid State Drives offer benefits such as speed, low power consumption, durability and low weight. However, they have not yet become as widely used as traditional hard drives, primarily because they are much more expensive.
HDDs and the most common Solid-State Drives can be used relatively interchangeably. Their general form factors (2.5” or 3.5”), as well as the ways they connect to systems (usually SATA), are often identical, though many new form factors and connector types have emerged to take advantage of flash media’s speed and small size.
HDDs and SSDs differ in the way data is managed. It is far more complicated on a solid state drive than on traditional rotating media. NAND flash, the storage medium used in SSDs, has some major limitations that must be managed, leaving room for manufacturers to mishandle important functions in ways that impact performance.
Understanding NAND Flash
Reading/Writing by Page and Erasing by Block
NAND flash is organized into Pages and Blocks. Each Block contains a specified number of Pages (32 to 256), and each Page holds a specified amount of data (2 to 8 KB). Unlike on a traditional rotating hard drive, data in NAND flash can't be overwritten in place. When any data changes, the updated version must be written out again to a fresh Page:
1. “Green” data is written.
2. As data is added to the Block, the Green data is updated. Because it can’t be overwritten, it must be written again, and the original data is marked for deletion.
3. This process is repeated every time the Green data is updated.
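The update cycle above can be sketched in a few lines of Python. This is a simplified illustration, not any vendor's actual firmware; the `Block` class, page states, and four-page block size are all invented for the example.

```python
# Hypothetical sketch of out-of-place writes in NAND flash.
# A page can be written once; an update writes the new version
# to a fresh page and marks the old one stale ("marked for deletion").

EMPTY, VALID, STALE = "empty", "valid", "stale"

class Block:
    def __init__(self, pages_per_block=4):
        self.state = [EMPTY] * pages_per_block
        self.data = [None] * pages_per_block

    def write(self, value):
        """Write value to the first empty page; return its page index."""
        page = self.state.index(EMPTY)
        self.data[page] = value
        self.state[page] = VALID
        return page

    def update(self, old_page, new_value):
        """NAND can't overwrite in place: invalidate the old page
        and write the new version to a fresh page."""
        self.state[old_page] = STALE
        return self.write(new_value)

block = Block()
green = block.write("green v1")          # step 1: data is written
green = block.update(green, "green v2")  # step 2: rewritten, old page stale
green = block.update(green, "green v3")  # step 3: repeated on every update
print(block.state)  # ['stale', 'stale', 'valid', 'empty']
```

Note how two of the four pages are now dead weight: they hold stale data but can't be reused until the whole Block is erased.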
Also, while data can be read and written at the Page level, deletion can only happen at the Block level. In other words, you can read or write any number of Pages from a Block or to a Block, but if you want to erase any part of that Block, you have to erase the whole thing. What this means is that data must be constantly shuffled and re-shuffled in order to free up Blocks for reuse:
1. Brand new! All pages are empty.
2. Data starts getting written.
3. The block is full.
4. Data gets marked for deletion.
5. Data not marked for deletion must be combined with other data and “moved,” or rewritten, to utilize available space. Here, “good data” not marked for deletion is rewritten from the Blue Block and the Orange Block to an empty block with available storage space.
6. Once the good or valid data is rewritten to another Block, all the data in the Block can be marked for deletion.
7. Once all the data in a block has been cleared, it can be reused.
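The shuffle described in steps 5 through 7 is the essence of garbage collection. Here is a minimal sketch of the idea, assuming a toy representation where a block is a list of pages and `None` marks a page that is stale; real controllers track far more state than this.

```python
# Hypothetical garbage-collection sketch: valid pages from partly-stale
# blocks are copied into a free block, then the source blocks are erased
# whole (erase works only at Block granularity).

def garbage_collect(blocks, free_block):
    """Move valid pages out of 'blocks' into 'free_block', then erase them."""
    for block in blocks:
        for page in block:
            if page is not None:          # None marks a stale page
                free_block.append(page)
        block.clear()                     # erase the entire block
    return free_block

blue   = ["b1", None, "b3", None]   # half the pages are stale
orange = [None, "o2", None, "o4"]   # half the pages are stale
spare  = []

garbage_collect([blue, orange], spare)
print(spare)          # ['b1', 'b3', 'o2', 'o4']
print(blue, orange)   # [] [] -- both blocks erased and ready for reuse
```

The key point is the copying: two half-stale blocks cost a full block's worth of extra writes before either can be erased.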
One byproduct of this constant shuffling is that, unlike traditional hard drives where the physical location of each bit of data is known and constant, the physical location of data in an SSD is highly abstracted from the outside world. Whereas each Logical Block Address (LBA) on an HDD always points to the same physical location, the physical location to which an SSD LBA points changes often.
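That abstraction layer can be pictured as a lookup table that the drive rewrites constantly. The sketch below is illustrative only; the `MappingTable` class and its always-advancing free-page counter are stand-ins for a real FTL's far more elaborate bookkeeping.

```python
# Hypothetical sketch of LBA-to-physical mapping. On an HDD an LBA
# maps to a fixed physical location; on an SSD the mapping layer points
# the same LBA at a new physical page on every rewrite.

class MappingTable:
    def __init__(self):
        self.lba_to_phys = {}
        self.next_free = 0

    def write(self, lba):
        """Each write to an LBA lands on a fresh physical page."""
        self.lba_to_phys[lba] = self.next_free
        self.next_free += 1
        return self.lba_to_phys[lba]

table = MappingTable()
first  = table.write(lba=42)   # LBA 42 -> physical page 0
second = table.write(lba=42)   # same LBA -> physical page 1
print(first, second)  # 0 1
```

From the host's point of view nothing moved: LBA 42 still returns the latest data. Physically, the data now lives somewhere else entirely.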
As a result of these issues, the same data (and metadata) ends up being written over and over again. This is called “write amplification.” Write amplification wouldn’t be a problem, except that a block can only be used so many times before its chances of failure start increasing.
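Write amplification is usually expressed as a factor: total bytes physically written to the NAND divided by the bytes the host asked to write. The numbers below are purely illustrative, not measurements from any particular drive.

```python
# Write amplification factor (WAF): bytes the NAND actually writes
# divided by the bytes the host requested. The extra writes come from
# garbage collection copying valid pages between blocks.

def write_amplification(host_bytes, gc_bytes):
    """WAF = (host writes + internal GC rewrites) / host writes."""
    return (host_bytes + gc_bytes) / host_bytes

# Illustrative numbers: the host writes 100 MB, and garbage collection
# relocates another 60 MB of valid data along the way.
print(write_amplification(100, 60))  # 1.6
```

A factor of 1.6 means every host write costs 1.6 writes of flash wear, which is why write amplification matters for the block lifespans discussed next.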
The Lifespan of Blocks
Each Block within a NAND flash component has a certain "lifespan" after which it becomes unreliable. This is often measured in program/erase cycles (P/E cycles). If the same Blocks are used over and over again, they will fail sooner than Blocks that have sustained less use.
The Flash Translation Layer (FTL)
Manufacturers use firmware called the Flash Translation Layer (FTL) to ensure that their SSDs perform to a desired level while managing these limitations. This code is embedded in the SSD and controls the data coming in and out of the drive.
Each manufacturer’s FTL is a highly proprietary “secret sauce” of algorithms that handle how data is managed in the drive. The FTL handles garbage collection, deciding when and how to free up Blocks to be erased and reused. It handles wear leveling, spreading P/E cycles evenly across available Blocks to increase the longevity of the drive. It also handles many other necessary functions.
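Wear leveling in particular can be reduced to a simple idea: when a fresh block is needed, prefer the one with the fewest P/E cycles on it. The sketch below shows that policy in miniature; real FTLs use much more sophisticated heuristics (hot/cold data separation, static wear leveling, and so on), and the counts here are invented.

```python
# Hypothetical wear-leveling sketch: instead of reusing the same block
# repeatedly, the FTL picks the block with the fewest program/erase
# cycles, spreading wear evenly across the drive.

def pick_block(pe_counts):
    """Return the index of the least-worn block."""
    return min(range(len(pe_counts)), key=lambda i: pe_counts[i])

pe_counts = [5, 2, 9, 2]          # P/E cycles consumed per block
for _ in range(4):                # four erases, wear-leveled
    victim = pick_block(pe_counts)
    pe_counts[victim] += 1

print(pe_counts)  # [5, 4, 9, 4] -- the least-worn blocks absorbed the wear
```

Without this policy, a naive controller might hammer one block until it wore out while others sat nearly unused.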
Since the efficiency, performance and endurance of an SSD are determined in large part by how the FTL works, this is one of the primary places where manufacturers can differentiate themselves. It is also where manufacturers can fail in properly executing important functions.
Test and Verify
As you can see, the nature of the NAND chips used in Solid State Drives creates the need for complex data manipulation in order to improve the performance of the drive to an acceptable level. The exact algorithms for handling this manipulation are the responsibility of the FTL, which differs from manufacturer to manufacturer and from model to model. Luckily, these complex algorithms are invisible to most users. However, they DO have an impact on certain areas such as digital forensics and data recovery. More on that in a future post…
Despite the challenges of properly managing data, the barriers to entry for the SSD market are low, as the components necessary to build an SSD are widely available. Unlike the hard disk drive market, where there are only a few major manufacturers, the SSD market has many manufacturers, both experienced and inexperienced. If you choose to invest in SSDs, you now have some understanding of the pitfalls that can come along with the benefits they provide. Make sure to test and verify SSDs that will be used for important applications or in any form of rollout.