Saturday, January 12, 2013

Thanks for the Memories ... Part Deux


I described the evolution of memory and external storage and set the stage for the introduction of "Solid State Drives" in my previous article. You can read this earlier note by clicking here.

Now I would like to explain more about these new combinations of memory and external storage and how they are specified.

It is a fact that disk drives have shrunk from 12 inch platters to 5.25 inch, to 3.5 inch, and even as small as 2.5 inch drives. But these are still not small enough to fit in a typical cell phone. That size limitation, coupled with advances in memory design, has changed the balance between memory and storage. We know that non-volatile memory chips now exist. We see them in use every day in cameras and smart phones. They have been attached to USB plugs to create “memory keys” or “plug-in drives” or “USB drives” or several other names.

The result is that hard disk external storage is being replaced by solid state memory, producing the SSD or “Solid State Drive.” These fast, non-mechanical drives are starting to appear in computers, especially small, light, mobile computers – laptops. The Apple MacBook Air is famous for its SSD, as are the Windows laptops that copied the Air’s design – the so-called “Ultrabooks.” Most of the new Chromebooks also have SSDs.

I’ve got a lot of computers. Counting little things like my Raspberry Pi and laptops, as well as my larger desktop systems and servers, I have over a dozen computers here at home. Four of them use SSDs. The first thing you notice with an SSD is a much faster boot time. My MacBook Air boots in 12 seconds and my Chromebook boots in 10 seconds. Of course the small size and the resulting longer battery life are obvious advantages too. But one funny thing about SSDs is that they have reversed the simple math we use to calculate and advertise storage capacity.

For maximum efficiency, all things digital are based on the powers of 2. Now if you raise 2 to the 10th power, you get 1024. That’s pretty close to one thousand, or as engineers say, a kilo or “k.” Computer data is often organized into 8-bit packages called “bytes.” A byte can store a single character like the letter ‘a’ or a capital ‘T’ or the pound sign ‘#.’ A kilobyte is 1024 bytes. Continuing with this combination of decimal terms and binary, powers-of-two values, a megabyte is 1024 kilobytes, or 1,048,576 bytes. Notice that number is called a megabyte, but it is quite a bit more than a million bytes.
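To make the arithmetic concrete, here is a quick sketch in Python (any calculator would do just as well) showing how the binary units stack up:

    # Each step up the ladder multiplies by 2**10 = 1,024, not by 1,000.
    KILOBYTE = 2 ** 10            # 1,024 bytes
    MEGABYTE = 1024 * KILOBYTE    # 1,048,576 bytes
    GIGABYTE = 1024 * MEGABYTE    # 1,073,741,824 bytes

    print(f"1 KB = {KILOBYTE:,} bytes")
    print(f"1 MB = {MEGABYTE:,} bytes")
    print(f"1 GB = {GIGABYTE:,} bytes")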

So when you buy a 500 megabyte storage drive, you actually get quite a bit more than 500 million bytes. Of course, these days you would get gigabytes or even terabytes, but the math works out the same way. You get more than the name implies. Sort of like a continual 10% off sale down at the local clothing store, or buying a 12 oz. latte and getting 16 oz. for the price.
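Continuing the little sketch above, here is that same idea as arithmetic (the percentages are just the math of binary versus decimal units, not the specifications of any particular drive):

    # How much more a binary "500 megabytes" is than a decimal 500 million bytes.
    binary_500_mb  = 500 * 1024 ** 2     # 524,288,000 bytes
    decimal_500_mb = 500 * 1000 ** 2     # 500,000,000 bytes
    bonus = (binary_500_mb - decimal_500_mb) / decimal_500_mb
    print(f"{bonus:.1%} extra")          # about 4.9% extra

    # The gap widens with each larger unit:
    print(f"{(1024**3 - 1000**3) / 1000**3:.1%}")   # ~7.4% for gigabytes
    print(f"{(1024**4 - 1000**4) / 1000**4:.1%}")   # ~10.0% for terabytes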

Now there is some overhead in drive design. Not all the storage space is used for storing your data. Some of the space on a disk drive is used to store information about where things are stored. In the early days of hard disks, IBM invented the term VTOC, or Volume Table of Contents. I actually worked for the engineer who had invented that term -- a little bit of history for folks from the fifties.

The original PC DOS used a system called FAT, for File Allocation Table. There is a directory area plus two FAT areas (the second kept for redundancy). But the customer still gets well over the advertised storage, thanks to the numeric advantage of the power-of-two calculations.

I won’t get into details, but comparing the directory and FAT to a table of contents is a good analogy. Just as you would refer to a table of contents to find the page where a particular topic is located, so an area on the disk is reserved to record the locations where specific files are stored. Still, even with this overhead, a gigabyte disk usually has more than a billion bytes of capacity. You continue to get more than you paid for, or – at least – more than was advertised. Not so with SSDs. They are “over-provisioned.”
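To push the analogy a little further, here is a toy model of a directory and file allocation table in Python. It is my own simplification for illustration, not the real on-disk FAT format: the directory records where each file starts, and the FAT chains one cluster to the next until an end-of-file marker.

    # Toy model of a FAT-style index (greatly simplified).
    # The directory maps a file name to its first cluster; the FAT maps each
    # cluster to the next cluster in the file. None marks end-of-file and
    # "FREE" marks clusters available for new data.
    EOF, FREE = None, "FREE"

    directory = {"PHOTO1.JPG": 2, "NOTES.TXT": 5}
    fat = {2: 3, 3: 7, 7: EOF,         # PHOTO1.JPG occupies clusters 2 -> 3 -> 7
           5: EOF,                     # NOTES.TXT occupies cluster 5 only
           4: FREE, 6: FREE, 8: FREE}  # free space for new files

    def clusters_of(name):
        """Follow the FAT chain to list every cluster a file occupies."""
        chain, cluster = [], directory[name]
        while cluster is not EOF:
            chain.append(cluster)
            cluster = fat[cluster]
        return chain

    print(clusters_of("PHOTO1.JPG"))   # [2, 3, 7]

Looking up a file is just a matter of reading this reserved bookkeeping area, which is exactly the table-of-contents role described above.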

The over-provisioning of NAND flash memory in solid state drives (SSDs) and flash memory-based accelerator cards (often used as caches) is a required practice in the storage industry, owing to the need for a controller to manage the NAND flash memory. This is true for all segments of the computer industry—from Ultrabooks and tablets to enterprise systems and cloud servers.

Essentially, over-provisioning allocates a portion of the total flash memory available to the flash storage processor, which it needs to perform various memory management functions. This leaves less usable capacity, of course, but results in superior performance and endurance. More sophisticated applications require more over-provisioning, but the benefits inevitably outweigh the reduction in usable capacity.

NAND flash memory is unlike both RAM (random access memory) and magnetic media, including hard disk drives, in one fundamental way: there is no ability to overwrite existing content. Instead, entire blocks of flash memory must first be erased before any new pages can be written.


With a hard disk drive (HDD), for example, the act of “deleting” files affects only the metadata in the directory (the VTOC or FAT). No data is actually deleted on the drive; the sectors used previously are merely made available as “free space” for storing new data. This is the reason “deleted” files can be recovered (or “undeleted”) from HDDs, and why it is necessary to actually erase sensitive data to fully secure a drive.

With NAND flash memory, by contrast, free space can only be created by actually erasing the data that previously occupied a block of memory. The process of reclaiming blocks of flash memory that no longer contain valid data is called “garbage collection.” Only when the blocks, and the pages they contain, have been cleared in this fashion are they able to store new data during a write operation.
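Here is a highly simplified sketch of that erase-before-write rule and of garbage collection. This is my own toy model for illustration, not any real drive’s firmware: pages can only be written into erased slots, and reclaiming a partly stale block means copying out its still-valid pages and then erasing the whole block.

    # Toy model of NAND blocks and garbage collection (greatly simplified).
    PAGES_PER_BLOCK = 4

    class Block:
        def __init__(self):
            self.pages = [None] * PAGES_PER_BLOCK   # None = erased page
            self.stale = [False] * PAGES_PER_BLOCK  # True = old, invalid data

        def free_pages(self):
            return [i for i, page in enumerate(self.pages) if page is None]

        def write(self, data):
            """Write into an erased page; impossible if none are left."""
            free = self.free_pages()
            if not free:
                raise RuntimeError("block full: must erase before writing")
            self.pages[free[0]] = data

        def erase(self):
            """Whole-block erase -- the only way to reclaim stale pages."""
            self.pages = [None] * PAGES_PER_BLOCK
            self.stale = [False] * PAGES_PER_BLOCK

    def garbage_collect(victim, spare):
        """Copy still-valid pages into an erased spare block, then erase the victim."""
        for data, is_stale in zip(victim.pages, victim.stale):
            if data is not None and not is_stale:
                spare.write(data)
        victim.erase()

The reserve capacity described below gives the controller room to keep spare, already-erased blocks on hand for exactly this kind of shuffling.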

(Note the distinction. Magnetic media can be “overwritten.” That means you can write over the old data, effectively erasing what was there before. NAND flash memory must be erased before it is overwritten. That is one reason it is suggested you not delete individual pictures off a camera storage card. Instead, after copying the pictures to your computer, you should just reformat the memory card, thereby erasing everything in one operation. That is both more reliable and will make the memory card last longer.)

The flash storage processor (FSP) is responsible for managing the pages and blocks of memory, and also provides the interface with the operating system’s file subsystem. This need to manage individual cells, pages and blocks of flash memory requires some overhead, and that, in turn, means that the full amount of memory is not available to the user. To provide a specified amount of user capacity it is therefore necessary to over-provision the amount of flash memory.

The portion of total NAND flash memory capacity held in reserve (unavailable to the user) for the FSP goes to garbage collection (the major use); FSP firmware (a small percentage); spare blocks (another small percentage); and, optionally, enhanced data protection beyond the basic error correction (the space required varies). These are all “overhead” functions that need space to operate efficiently, but are not used for direct storage.

Even though there is a loss in user capacity with over-provisioning, the user does receive two important benefits: better performance and greater endurance. (Endurance is how many times the drive can be written to before failure. There is an intrinsic limit to how many erase cycles a solid state memory can perform before failure.) The former is one of the reasons for using flash memory, including in solid state drives (SSDs), while the latter addresses an inherent limitation in flash memory. 

(Note that, although HDDs are mechanical devices which do wear out and have a mean time to failure, SSDs also have a sort of electronic "wear" and a similar failure statistic. But, rather than time, it is measured in the number of erasures.)
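A common back-of-the-envelope way to think about that electronic "wear" is to multiply the drive's capacity by the number of program/erase cycles the flash is rated for, and divide by how much extra internal copying the controller does (the so-called write amplification). The numbers below are made-up examples for illustration, not the specifications of any real drive:

    # Rough endurance estimate (illustrative numbers only).
    capacity_gb         = 120     # user-visible capacity
    pe_cycles           = 3000    # assumed program/erase cycles per cell
    write_amplification = 1.5     # assumed extra internal writes per user write

    total_writes_gb = capacity_gb * pe_cycles / write_amplification
    print(f"~{total_writes_gb / 1024:.0f} TB of writes before wear-out")

    # At, say, 10 GB of new data per day, that works out to decades of use.
    print(f"~{total_writes_gb / 10 / 365:.0f} years at 10 GB per day")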

The equation for calculating the percentage of over-provisioning is rather straightforward:


Percentage Over-provisioning =
 (Total Capacity – Available Storage Capacity) / Available Storage Capacity

For example, in a configuration consisting of 128 Gigabytes (GB) of flash memory total, 120 GB of which is available to the user for storage, the system is over-provisioned by 6.7 percent, which is typically rounded up to 7 percent:

(128 GB – 120 GB) / 120 GB = 0.067, or 6.7%
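Here is the same calculation written as a tiny Python function, just to show the arithmetic (the 128 GB and 120 GB figures are the ones from the example above):

    def over_provisioning(total_gb, usable_gb):
        """Fraction of the raw flash held back from the user."""
        return (total_gb - usable_gb) / usable_gb

    print(f"{over_provisioning(128, 120):.1%}")   # 6.7%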

As SSD capacity increases, this loss of useful storage becomes less significant. For now, it is a disadvantage compared to an HDD at sizes like 64 and 128 GB. With 256 GB and larger SSDs, it will matter less.
