From OpenChannelSSD to ZNS

The Open-Channel SSD open-source project on GitHub has not been updated for a long time. It was built on a QEMU-based platform, bugs appear often, and it places strict requirements on the kernel, QEMU, and OS versions.
Many papers have been published on the topic, but a lot of them read like theoretical hypotheses whose experimental results are hard to reproduce. For learning purposes, it is therefore best to stick to papers from top conferences; after all, Alibaba's internal technology is not public.
To reach enterprise-level cloud storage efficiency today, a single SSD has to serve many different workloads at once. When an SSD is shared, interference between workloads makes latency fluctuate, and worst-case latency rises sharply. A cloud environment can deliver good quality of service only if every user of the drive gets stable service quality.

The main idea of Open-Channel SSD is to hand the FTL, which a traditional SSD runs internally, over to the host, so that users can in effect build their own SSD.
Open-Channel SSD introduces two concepts: the chunk and the parallel unit (PU).
Chunk features:
Written sequentially within its LBA range;
Must be reset before it can be rewritten;
Similar to the HDD SMR specifications (ZAC/ZBC);
Optimized for the physical constraints of SSDs: writes are aligned with the media.
Parallel unit features:
The host can direct I/O from individual workloads to separate parallel units;
A parallel unit maps to a single die or is striped across multiple dies;
A parallel unit inherits the throughput and latency characteristics of the underlying media;
Conceptually similar to I/O determinism in NVMe.
Open-Channel SSD thus provides I/O isolation and predictable latency. The FTL functions move to the host, which becomes responsible for data management and I/O scheduling.
In practice, however, the Open-Channel specification defines only the parts that are common across devices. SSDs from different vendors have different characteristics that are hard to unify, and the specification is still not flexible enough for customized applications and workloads.
About Zoned Namespaces (ZNS)
The Open-Channel SSD architecture has been adopted by Alibaba, Microsoft, and others. How to make this architecture part of the NVMe standard while still allowing flexible customization is a hot research topic. Western Digital has pushed functionality that addresses the key OCSSD use cases into NVMe, and proposed the concept of ZNS.
ZNS is a technical proposal in the NVMe working group.
Compared with a conventional NVMe namespace, a Zoned Namespace divides the namespace's logical address space into zones. The basic zone operations include Read, Append Write, Zone Management, and Get Log Page.
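To make the division of the logical address space concrete, here is a minimal Python sketch. It assumes every zone spans the same number of LBAs; the class name `ZoneGeometry` and the numbers are hypothetical and only serve as an illustration, not as any real device's layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ZoneGeometry:
    """Hypothetical description of a zoned namespace, in logical blocks."""
    zone_size: int   # number of LBAs spanned by each zone (assumed fixed)
    num_zones: int   # number of zones in the namespace

    def zone_of(self, lba: int) -> int:
        """Return the index of the zone that contains this LBA."""
        if not 0 <= lba < self.zone_size * self.num_zones:
            raise ValueError(f"LBA {lba} is outside the namespace")
        return lba // self.zone_size

    def zone_start(self, zone: int) -> int:
        """Return the first LBA of the given zone."""
        if not 0 <= zone < self.num_zones:
            raise ValueError(f"zone {zone} does not exist")
        return zone * self.zone_size


geo = ZoneGeometry(zone_size=4096, num_zones=8)
print(geo.zone_of(10000))   # 2
print(geo.zone_start(2))    # 8192
```

Because the zone size is fixed, finding the zone for an LBA is a single integer division, with no lookup table on the host.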
The zone interface is standardized in order to:
Reduce device-side write amplification (WAF);
Reduce over-provisioning (OP);
Reduce the SSD's DRAM, one of the more expensive components of an SSD;
Improve latency and throughput;
Enable a software ecosystem.
How should this be understood?
ZNS is similar to ZBC/ZAC for SMR drives;
The storage space is divided into multiple zones;
Each zone is written sequentially;
The interface is optimized for SSDs;
It is aligned with media characteristics (zone size is aligned to the NAND block size, zone capacity to the media size);
It reduces NAND erase cycles.
Information about Zone:
Zone state transition
Empty, Implicitly Opened, Explicitly Opened, Closed, Full, Read Only, Offline
Empty -> Open -> Full -> Empty -> ...
Zone Reset
Full -> Empty
Zone size and zone capacity
Zone size is fixed
Zone capacity is the writable area in a zone
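The state cycle and the size/capacity distinction above can be sketched as a toy model in Python. This is a simplification under stated assumptions: only three of the seven states are modeled, the write pointer is tracked in logical blocks, and the `Zone` class and its method names are hypothetical.

```python
from enum import Enum


class ZoneState(Enum):
    EMPTY = "Empty"
    IMPLICITLY_OPENED = "Implicitly Opened"
    FULL = "Full"


class Zone:
    """Toy model of one zone; all sizes are in logical blocks."""

    def __init__(self, start_lba: int, size: int, capacity: int):
        if capacity > size:
            raise ValueError("zone capacity cannot exceed zone size")
        self.start_lba = start_lba
        self.size = size            # fixed span of LBAs
        self.capacity = capacity    # writable portion of that span
        self.write_pointer = start_lba
        self.state = ZoneState.EMPTY

    def write(self, lba: int, nblocks: int) -> None:
        """Sequential write: it must start exactly at the write pointer."""
        if self.state is ZoneState.FULL:
            raise IOError("zone is full; reset it before rewriting")
        if lba != self.write_pointer:
            raise IOError("write does not start at the write pointer")
        if lba + nblocks > self.start_lba + self.capacity:
            raise IOError("write crosses the zone capacity")
        self.write_pointer += nblocks
        if self.write_pointer == self.start_lba + self.capacity:
            self.state = ZoneState.FULL
        else:
            self.state = ZoneState.IMPLICITLY_OPENED

    def reset(self) -> None:
        """Zone Reset: return to Empty and rewind the write pointer."""
        self.write_pointer = self.start_lba
        self.state = ZoneState.EMPTY


zone = Zone(start_lba=0, size=1024, capacity=768)
zone.write(0, 512)
zone.write(512, 256)
print(zone.state)   # ZoneState.FULL
zone.reset()
print(zone.state)   # ZoneState.EMPTY
```

The zone becomes Full once the write pointer reaches the capacity (768 blocks here), even though the zone size is 1024; the remaining LBAs are not writable.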
The biggest difference from Open-Channel is that in a Zoned Namespace, zones are addressed by LBA (Logical Block Address). A Zoned Namespace therefore avoids the cumbersome address conversions of Open-Channel.
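The contrast can be illustrated with a rough Python sketch. It assumes an Open-Channel address is a hierarchy of group, parallel unit, chunk, and sector packed into bit fields; the field widths below are hypothetical, since a real device reports its own layout. The function names are also made up for this example.

```python
# Hypothetical field widths in bits; a real Open-Channel device reports its own.
FIELDS = (("group", 4), ("pu", 4), ("chunk", 12), ("sector", 12))


def pack_ocssd(group: int, pu: int, chunk: int, sector: int) -> int:
    """Pack a (group, PU, chunk, sector) tuple into one integer address."""
    values = {"group": group, "pu": pu, "chunk": chunk, "sector": sector}
    address = 0
    for name, bits in FIELDS:
        value = values[name]
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{name}={value} does not fit in {bits} bits")
        address = (address << bits) | value
    return address


def unpack_ocssd(address: int) -> dict:
    """Split a packed address back into its fields."""
    fields = {}
    for name, bits in reversed(FIELDS):
        fields[name] = address & ((1 << bits) - 1)
        address >>= bits
    return fields


def zns_lba(zone: int, offset: int, zone_size: int) -> int:
    """In ZNS a block is addressed by a plain LBA: zone start plus offset."""
    return zone * zone_size + offset


packed = pack_ocssd(group=1, pu=2, chunk=3, sector=4)
print(hex(packed), unpack_ocssd(packed))
print(zns_lba(zone=3, offset=4, zone_size=4096))   # 12292
```

With Open-Channel the host has to know the device geometry to build and decode addresses, whereas a ZNS address is an ordinary LBA.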
Writing to one zone from multiple writers scales poorly (as shown in the figure below): ZAC/ZBC requires writes within a zone to be issued in strict sequential order, which limits write performance and increases host overhead. This poses major challenges for the software ecosystem, HBAs, and other components.
For this reason, an additional command, Zone Append, is introduced. It appends data to a zone without specifying an offset, and the drive returns the location at which the data was written in the zone.
Zone Write example:
3x Writes (4K, 8K, 16K) – Queue Depth = 1
Zone Append example:
3x Writes (4K, 8K, 16K) – Queue Depth = 3
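The difference between the two examples can be simulated with a short Python sketch. It assumes a 4K logical block and models only the write pointer; `ToyZone` and its methods are hypothetical names, not a real driver API.

```python
BLOCK = 4096  # assumed logical block size in bytes


class ToyZone:
    """Tracks only the write pointer of a single zone (in logical blocks)."""

    def __init__(self, start_lba: int, capacity: int):
        self.start_lba = start_lba
        self.capacity = capacity
        self.write_pointer = start_lba

    def _advance(self, nblocks: int) -> int:
        if self.write_pointer + nblocks > self.start_lba + self.capacity:
            raise IOError("zone is full")
        assigned = self.write_pointer
        self.write_pointer += nblocks
        return assigned

    def write(self, lba: int, nblocks: int) -> None:
        """Regular write: the host picks the LBA, which must be the write pointer."""
        if lba != self.write_pointer:
            raise IOError(f"LBA {lba} is not the write pointer {self.write_pointer}")
        self._advance(nblocks)

    def append(self, nblocks: int) -> int:
        """Zone Append: the device picks the LBA and reports it back."""
        return self._advance(nblocks)


def blocks(nbytes: int) -> int:
    return nbytes // BLOCK


# Regular writes: offsets are fixed by the host, so a reordered write fails.
zone = ToyZone(start_lba=0, capacity=64)
zone.write(0, blocks(4096))
try:
    zone.write(3, blocks(16384))   # the 16K write overtakes the 8K write
except IOError as exc:
    print("rejected:", exc)

# Zone Append: the three requests may arrive in any order.
zone = ToyZone(start_lba=0, capacity=64)
placed = [zone.append(blocks(n)) for n in (16384, 4096, 8192)]
print(placed)   # [0, 4, 5]
```

This is why regular zone writes are effectively limited to queue depth 1 per zone, while Zone Append lets several requests be in flight at once: the host learns each location only from the completion.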
ZNS has synergy with the ZAC/ZBC software ecosystem. Existing ZAC/ZBC-aware file systems and device mappers already work, and supporting ZNS requires very few changes. Work already done for ZAC/ZBC hard drives (SMR) can be reused. ZNS integrates directly with the file system: there is no host-side FTL, a device with 1TB of media does not require 1GB of DRAM, and the SSD can be used more effectively. The code is already in production use at major vendors and is available in the Linux ecosystem.
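The "1TB of media, 1GB of DRAM" figure matches a common back-of-the-envelope estimate for a conventional SSD. The sketch below assumes a flat mapping table with one 4-byte entry per 4K mapping unit and uses binary units (TiB, GiB); the 64 MiB coarse unit is a hypothetical value chosen only to show how the table shrinks when the device no longer maps every 4K block.

```python
def mapping_table_bytes(capacity_bytes: int, map_unit_bytes: int,
                        entry_bytes: int = 4) -> int:
    """Size of a flat logical-to-physical table, one entry per mapping unit."""
    return (capacity_bytes // map_unit_bytes) * entry_bytes


KIB = 1 << 10
MIB = 1 << 20
TIB = 1 << 40

page_level = mapping_table_bytes(TIB, map_unit_bytes=4 * KIB)
coarse = mapping_table_bytes(TIB, map_unit_bytes=64 * MIB)  # hypothetical unit

print(page_level // MIB, "MiB")   # 1024 MiB
print(coarse // KIB, "KiB")       # 64 KiB
```

Under these assumptions the page-level table alone takes about 1 GiB per TiB of media, which is the DRAM cost that sequential-write zones let the device avoid.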
ZNS uses the existing storage stack:
User-space libraries and tools
libzbd
nvme-cli
blktests
util-linux (blkzone)
fio
libzns
Kernel space
NVMe support for zones
XFS, Btrfs, F2FS, dm-zoned, etc.
QEMU with ZNS support
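On a Linux system with zoned block device support, the kernel exposes a device's zone model through sysfs, which is a quick way to check whether this stack sees a device as zoned. The Python sketch below reads the `zoned`, `nr_zones`, and `chunk_sectors` attributes under `/sys/block/<device>/queue`; the device name `nvme0n1` is only a placeholder, and the helper `zoned_info` is a hypothetical name.

```python
from pathlib import Path


def zoned_info(device: str, sysfs_root: str = "/sys/block") -> dict:
    """Read the zone model and zone layout of a block device from sysfs."""
    queue = Path(sysfs_root) / device / "queue"

    def read(name: str):
        path = queue / name
        return path.read_text().strip() if path.exists() else None

    def as_int(text):
        return int(text) if text is not None else None

    return {
        # "none", "host-aware", or "host-managed"
        "model": read("zoned") or "none",
        "nr_zones": as_int(read("nr_zones")),
        # For zoned devices, chunk_sectors is the zone size in 512-byte sectors.
        "zone_sectors": as_int(read("chunk_sectors")),
    }


# Placeholder device name; reports model "none" if the device is absent.
print(zoned_info("nvme0n1"))
```

A ZNS namespace is expected to report the `host-managed` model, the same model used for host-managed SMR drives.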
ZNS provides a basis for meeting the QoS and latency requirements of most applications, but it is not as flexible as Open-Channel.
