SMR (Shingled Magnetic Recording) 101

By Paul Alcorn

SMR is a new HDD recording technology designed to increase density beyond the limits of traditional Perpendicular Magnetic Recording (PMR).

However, the architecture also creates a new set of challenges that affect both performance and compatibility. Because of those performance characteristics, SMR will be used in archive, backup, data lake and repository, WORM (Write Once Read Many) and object storage environments. These workloads can tolerate lower performance thresholds, and SMR is well suited to organizations with massive amounts of data that need a value-centric alternative to current nearline offerings.

In many ways, SMR is a stopgap measure as we await the arrival of more advanced recording techniques, such as HAMR (Heat-Assisted Magnetic Recording) and BPM (Bit-Patterned Media). SMR's compatibility with those emerging techniques, and its low cost, ensure it will be used for many years, so it is important to understand the finer points of SMR deployment.

SMR HDDs from several vendors are already penetrating enterprise applications through CSP (Cloud Service Provider) and hyperscale deployments, and support will work its way down to storage and server OEMs. SMR has pushed prices down to as low as three cents per usable GB, and there are already products geared for client applications as well, including new super-dense external storage offerings with up to 4TB of storage. Let's take a device-level look at the new technology before we move up the stack to the software layer.

Longitudinal Versus PMR

Longitudinal recording was the primary HDD recording method prior to PMR's arrival. It aligned the poles of each magnetic element (bit) horizontally; PMR, in contrast, aligns the poles vertically. Aligning the poles vertically allows the use of magnetically stronger materials (higher coercivity) by routing the magnetic flux through an additional under-layer (see graphic).

Materials with higher coercivity are more thermally stable, which matters because of the superparamagnetic limit: if the magnetic regions of a bit are too small, local thermal fluctuations can spontaneously de-magnetize them, resulting in data loss. The 'stronger' magnetic material allows manufacturers to continue shrinking the size of each bit, but it also has limits. PMR's maximum density is currently predicted to be roughly 1 Tbit/inch².
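
To put rough numbers on that limit, here is a back-of-the-envelope sketch of the commonly cited thermal-stability rule of thumb, which holds that the energy barrier KuV should exceed roughly 60 kBT for long-term data retention. The anisotropy constant and operating temperature below are illustrative assumptions, not vendor specifications.

```python
# Back-of-the-envelope check of the superparamagnetic limit: a grain's
# anisotropy energy (Ku * V) must comfortably exceed thermal energy (kB * T).
K_B = 1.380649e-23       # Boltzmann constant, J/K
T = 330.0                # assumed drive operating temperature, K
KU = 3.0e5               # assumed grain anisotropy energy density, J/m^3
STABILITY_FACTOR = 60    # commonly cited minimum Ku*V / (kB*T) ratio

min_volume = STABILITY_FACTOR * K_B * T / KU   # smallest thermally stable grain
min_diameter_nm = (6 * min_volume / 3.141592653589793) ** (1 / 3) * 1e9

print(f"Minimum stable grain volume: {min_volume:.2e} m^3")
print(f"Sphere-equivalent diameter:  {min_diameter_nm:.1f} nm")
```

With these assumed values the smallest stable grain works out to roughly 12nm across, which is why shrinking bits further demands higher-coercivity materials.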

Shingled Magnetic Recording

There is essentially a brick wall preventing further shrinking of the bits unless the industry moves to materials with even higher coercivity, and even then, the bits would still need to shrink further. Smaller bits also require smaller read/write elements on the disk heads, and therein lies the crux of the issue: today's reader and writer elements have reached their physical limitations, and without a radical new breakthrough they cannot be shrunk in a cost-effective manner.

With that avenue closed, the next task falls to examining the current structure of the drive for ways to use existing technology more efficiently. Magnetic bits are laid out in parallel concentric tracks on the surface of the platter, and today's tracks are roughly 75 nanometers wide, narrower than a flu virus. Conventional recording methods leave a small gap between the tracks to reduce interference from adjacent tracks when reading data back, and miniaturizing the read/write elements has allowed manufacturers to progressively place the tracks closer together. Advances in head technology and signal processing enable the read elements to pick out data in spite of the interference from adjacent tracks, and TDMR (Two Dimensional Magnetic Recording) is also used to increase accuracy.

Note that the reader head width (marked with a green stripe) is smaller than the writer head width (grey) in the graphic below.

SMR overlays the wider "write" portion of the tracks (much like the shingles on a roof), leaving only the read track exposed. The smaller read head can come back and read the data with ease. This technique is desirable to HDD vendors because it utilizes existing head designs, which helps minimize manufacturing and R&D costs. Pairing SMR with TDMR, which uses simultaneous incoming signals from two read elements to mathematically eliminate read errors, will also help to increase density.

Track shingling dramatically increases TPI (Tracks Per Inch). Seagate's initial offering increases capacity by 25 percent, and there are predictions of densities as high as four trillion bits per square inch, which could enable 20TB HDDs, and perhaps beyond, in the future.
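
As a rough illustration of how shingling translates into density, the short sketch below converts track pitch into TPI. The pitch and overlap figures are assumptions chosen to line up with the approximate numbers above, not measurements of any shipping drive.

```python
# Illustrative arithmetic only: converting track pitch to TPI, with an
# assumed overlap chosen to mirror the ~25 percent capacity gain cited above.
INCH_NM = 25.4e6                  # nanometers per inch

conventional_pitch_nm = 100       # ~75nm write track plus an assumed guard gap
shingled_pitch_nm = 80            # assumed exposed (read) pitch after overlap

conventional_tpi = INCH_NM / conventional_pitch_nm
shingled_tpi = INCH_NM / shingled_pitch_nm

print(f"Conventional: {conventional_tpi:,.0f} TPI")
print(f"Shingled:     {shingled_tpi:,.0f} TPI "
      f"(+{shingled_tpi / conventional_tpi - 1:.0%})")
```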

The only drawback is that rewriting data, or writing random data, becomes a significant challenge because the tracks with existing data are already overlaid with other tracks. Let's take a closer look at how engineers are addressing this problem on the following page.

Performance Challenges

The process of writing data to an SMR HDD is straightforward at first. The drive writes sequential data normally, and the tracks are overlapped (shingled) during the initial write process. However, if new (or modified) data needs to be placed near existing data, any modification of the existing sectors will overwrite the neighboring shingled tracks. Thus, an overwrite triggers a read-modify-write procedure due to the inevitable impact on the adjacent overlapped tracks. Unfortunately, random write activity also triggers this process, because any data dropped into a sector at random will affect overlapping tracks.

Image Courtesy of Seagate

The overwrite process begins with the HDD reading the data that will be overwritten, along with the data in the affected downstream sectors/tracks, and moving it to a cache. In the case of Seagate SMR drives, the cache consists of a section of the platter (rumored to be 20GB) used as persistent storage. The drive holds data in this persistent cache location while it reorganizes it, then rewrites it sequentially back to the home location.

An SMR HDD also utilizes DRAM caching to aggregate and sequentialize incoming random write data before writing it to the platters, and the cache section of the platter provides persistence for the data held in the volatile DRAM cache. This dual-purpose section of the platter can utilize SMR or standard PMR to speed up operations; vendors are not publicly divulging which method they use. The limited size of the platter-based cache speeds data operations because it reduces head movement as data transfers into, and out of, the cache. The persistent cache could also theoretically be NAND flash-based, and it wouldn't be surprising to see that technique come to market to help address the inherent performance limitations associated with SMR.
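
The staging behavior is easier to see in code. Below is a minimal conceptual sketch of a write cache that absorbs random writes and flushes them as one sorted, sequential pass; the class names, buffer size and flush policy are invented for illustration and bear no relation to any vendor's firmware.

```python
# Conceptual sketch: coalesce random host writes in a volatile buffer, then
# commit them to the media in LBA order as a single sequential sweep.
class Platter:
    """Stand-in for the persistent media; records the order writes land in."""
    def __init__(self):
        self.log = []

    def write_sequential(self, lba, data):
        self.log.append(lba)

class WriteCache:
    def __init__(self, capacity_blocks):
        self.capacity = capacity_blocks
        self.pending = {}                  # lba -> data; the newest write wins

    def write(self, lba, data, media):
        self.pending[lba] = data
        if len(self.pending) >= self.capacity:
            self.flush(media)

    def flush(self, media):
        # Sorting by LBA turns scattered updates into one sweep of the head.
        for lba in sorted(self.pending):
            media.write_sequential(lba, self.pending[lba])
        self.pending.clear()

cache, platter = WriteCache(capacity_blocks=4), Platter()
for lba in (9, 2, 7, 4):                   # random-order host writes
    cache.write(lba, b"data", platter)
print(platter.log)                         # [2, 4, 7, 9]: one sequential pass
```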

Image Courtesy of Seagate

The drive rewrites the modified sectors and tracks when it returns the data from the cache to the home location. If the drive were shingled end-to-end, every rewrite operation would require rewriting the entire HDD from the affected track onward. To avoid this scenario, the drive's tracks are grouped into bands, so any track modification only requires reorganization to the end of the band.
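
A small sketch makes the benefit of banding concrete: modifying a shingled track forces a rewrite only to the end of its band, not to the end of the drive. The band size and list-based layout below are invented for illustration; real firmware is far more involved.

```python
# Conceptual model of band-limited read-modify-write on a shingled band.
BAND_TRACKS = 8                            # tracks per band (illustrative)

def rewrite_after_modify(band, track_index, new_data):
    """Overwriting track N clobbers the overlapped tracks after it, so
    everything from N to the end of the band is staged and rewritten;
    tracks before N are untouched."""
    staged = band[track_index:]            # read the affected tracks into cache
    staged[0] = new_data                   # apply the modification
    band[track_index:] = staged            # rewrite sequentially to band end
    return len(staged)                     # number of tracks rewritten

band = [f"track-{i}" for i in range(BAND_TRACKS)]
cost = rewrite_after_modify(band, 5, "track-5-v2")
print(f"Rewrote {cost} of {BAND_TRACKS} tracks")   # 3 of 8, not the whole drive
```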

HGST is utilizing 256MB bands in its inaugural offerings, while Seagate indicates that band sizes are adjustable for custom drive workloads and applications. These adjustments will likely result in drives developed specifically for targeted use cases as the technology progresses.

SMR's overlapped track structure is optimized for large sequential read/write operations, but the technique will struggle with any type of random write data.

To accomplish these complex data placement tasks, SMR HDDs remap the LBA tables; the graphic above is an example of the modified LBA mapping techniques utilized on an SMR drive. Even if the host writes data to the drive sequentially, the drive may not initially write the data in a contiguous fashion due to the LBA table remapping. This means that some portions of files that would normally be concentrated on the outer edge of the platter, for instance, could end up on other portions of the platter as well, which can affect performance consistency. This brings forth new challenges and an additional layer of complexity, much like that found in SSDs, that must be overcome to keep performance within expectations.
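
To illustrate the indirection, here is a toy dynamic LBA map in the spirit of an SSD translation layer. Real SMR mapping schemes are proprietary, and everything below is a simplified assumption.

```python
# Toy dynamic LBA map: logical addresses are decoupled from physical
# placement, and writes always land at the current append point of the log.
class DynamicMap:
    def __init__(self):
        self.table = {}          # logical LBA -> physical location
        self.next_physical = 0   # append point of the sequential log

    def write(self, lba):
        # Wherever the logical address falls, data lands at the append
        # point; the table records the indirection for later reads.
        self.table[lba] = self.next_physical
        self.next_physical += 1

    def read(self, lba):
        return self.table[lba]

m = DynamicMap()
for lba in (100, 3, 57):         # scattered logical addresses from the host
    m.write(lba)
print(m.table)                   # {100: 0, 3: 1, 57: 2}: physically contiguous
```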

The read-modify-write process is very similar to how SSDs utilize a flash translation layer in tandem with overprovisioning, and it incurs a heavy performance penalty in some scenarios. Many of the engineering techniques utilized to abstract the LBA space, such as using dynamic LBA mapping in favor of static mapping, come from established SSD technology. Ideally, the sequentialization of incoming random data and the read-modify-write process happen at the software layer (as explained on the following page).

Clever management techniques can reduce the performance penalty incurred by SMR technology. Each of the three emerging techniques has specific pros and cons that have a direct impact on performance and compatibility.

Drive-Managed (Autonomous) SMR

Drive-managed SMR provides the easiest deployment path simply because the drive is backwards compatible with all existing hardware, applications, filesystems and operating systems. The HDD exposes itself to the host system as a standard HDD, and the host has no knowledge that the underlying media employs SMR technology. Drive-managed SMR leans heavily upon the technologies pioneered with SSD garbage collection and translation layers, and it also features overprovisioning areas to assist in drive management operations.

The drive handles all data requests internally, which creates unpredictable performance degradation under some conditions. That is a drawback, but the cost of this type of implementation is the crucial component: many customers will tolerate the slow and unpredictable performance in exchange for the incredible economics, and thus we have already seen this method penetrate the client space as well.

Host-Managed (Restrictive) SMR

Host-Managed SMR is not backwards compatible with existing host systems and requires changes to hardware, filesystems, applications and operating systems. This technique provides the host system with knowledge of the underlying media and employs new commands that are under development by the T10/T13 standards committees. SCSI and SATA use ZBC (Zoned Block Commands) and ZAC (Zoned ATA Commands), respectively, to steer data directly to the various zones on an SMR HDD. The drive will reject the request if the host does not send SMR-tuned commands. New SAS and SATA HBAs that support ZBC and ZAC commands are working their way to market.

Both Host-Managed and Host-Aware SMR drives can optionally utilize distinct zones for different types of data, such as random write zones (conventional zones) and sequential write zones (write pointer zones). This technique allows the host to send different data patterns to each respective zone, which provides the best and most predictable performance overall. Under ideal conditions, all host data traffic is sequentialized and written to the sequential zones, but key portions of data (such as metadata) can be written to random zones in order to speed data access.
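
The zone behavior described above can be modeled in a few lines. This sketch is a conceptual stand-in, not the actual ZBC/ZAC command set: a conventional zone accepts writes anywhere, while a sequential-write zone tracks a write pointer and rejects anything that lands elsewhere, as a host-managed drive would.

```python
# Conceptual zone model: conventional zones accept random writes, while
# sequential-write zones only accept writes at the current write pointer.
class Zone:
    def __init__(self, start, length, sequential_required):
        self.start, self.length = start, length
        self.sequential_required = sequential_required
        self.write_pointer = start

    def write(self, lba):
        if not (self.start <= lba < self.start + self.length):
            raise ValueError("LBA outside zone")
        if self.sequential_required and lba != self.write_pointer:
            # A host-managed drive fails the command outright here; a
            # host-aware drive would accept it at a performance cost.
            raise IOError("write not at write pointer: rejected")
        self.write_pointer = max(self.write_pointer, lba + 1)

meta_zone = Zone(0, 256, sequential_required=False)     # e.g., metadata
data_zone = Zone(256, 65536, sequential_required=True)  # bulk sequential data

meta_zone.write(17)          # random write: fine in a conventional zone
data_zone.write(256)         # at the write pointer: accepted
try:
    data_zone.write(300)     # skips ahead: rejected on a host-managed drive
except IOError as err:
    print(err)
```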

Host-Managed SMR is the most difficult method to implement, but the performance advantages make it an attractive proposition.

Host-Aware (Cooperatively-Managed) SMR

Host-Aware SMR attempts to merge the best of both worlds. This method leverages the new command sets and is also backwards compatible with existing hardware, filesystems, applications and operating systems. SMR-optimized ZBC and ZAC commands execute as expected and provide the best performance. However, sub-optimal commands (normal commands) will also work, albeit at the expense of performance and predictability.

Conclusion

SMR HDDs provide unheard-of pricing; for instance, Seagate's offerings debuted at three cents per gigabyte. However, SMR also brings forth a slew of new challenges. Mitigating the performance impact associated with SMR is a technological hurdle best jumped at the application/filesystem/OS level, and the relatively quick support from the T10/T13 committees is encouraging. The Linux community is already embracing SMR on several fronts as well, with the libzbc user-space library, ZBC support added in Linux kernel 3.17 and ZAC support added in kernel 3.19.

The same type of support could revolutionize how we use SSDs as well, but there have not been any significant development efforts from industry committees to optimize at the filesystem/OS level to allow for more granular control of flash-based devices. Pundits have spilled a million words declaring that SSDs will replace HDDs, but the quick adoption of SMR standards is yet another sign of the HDD's staying power in the storage realm.

Paul Alcorn is a Contributing Editor for Tom's IT Pro, covering Storage. Follow him on Twitter and on Google+.
