Recently I talked to a service provider about metered billing of their hosted IT infrastructure. Each month they send a bill to all tenants using the infrastructure. Flash technology (solid state disk) is billed at a higher rate than spinning disk.
Inevitably, every customer receives a first (and more expensive than expected) bill detailing flash usage.
A call to the service provider yields the explanation that the I/O pattern of their application triggered movement of their data onto a flash tier (e.g. via the VMAX FAST algorithms). This movement can be recorded at a granularity fine enough to generate a bill.
The whole "cost of an I/O" discussion has actually rippled its way down into the caching and flash implementation inside VNX's MCx architecture. In a previous post about MCx I introduced the picture below and spent most of my time talking about the MCR layer:
In this post I'd like to highlight some of the work that the MCx team has done to leverage flash. The first thing worth pointing out is that the previous VNX architecture positioned the flash layer (now known as MCF) on top of the cache layer. This non-intuitive approach was a result of the historical co-location of the caching and RAID layers within one driver. Separating these two layers could not happen overnight; the work evolved over several years. Why? Because it is always challenging to change the engine while the car is running.
The MCx release marked the fruition of (a) the separation of cache and RAID, and (b) the ability of the cache to more fully leverage multiple cores.
As a result, the flash layer (MCF) could assume its logical place as a driver in between the caching and RAID drivers.
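To make the new ordering concrete, here is a minimal sketch of a read descending the reordered stack. The layer names and the promotion behavior are my own illustrative assumptions (plain dicts standing in for the real drivers), not VNX code:

```python
# Toy read path mirroring the reordered stack: MCC above MCF above MCR.
DRAM_CACHE, FLASH_CACHE, BACKEND = {}, {}, {42: b"block-42"}

def read(lba):
    if lba in DRAM_CACHE:          # 1) MCC: DRAM cache
        return DRAM_CACHE[lba]
    if lba in FLASH_CACHE:         # 2) MCF: flash cache, now below MCC
        return FLASH_CACHE[lba]
    data = BACKEND[lba]            # 3) MCR: RAID / spinning disk
    DRAM_CACHE[lba] = data         # promote on the way back up
    return data

print(read(42))  # b'block-42'
```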
What are the highlights of MCC and MCF? For starters, I found that the cache destage algorithms have grown in sophistication with regard to the "cost of access" described above. When I worked on the cache years ago, the flushing algorithms were primarily LRU-based. LRU tracks only recency: an item's position in the list reflects how recently it was touched. Where LRU fails is that a single new access, more recent but not more frequent, can push a frequently accessed item off the end of the list and out of the cache.
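To see that pitfall concretely, here's a toy LRU cache in Python. The workload is hypothetical, but it shows a block accessed three times being evicted by three one-off accesses:

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU: most recently used at the end, evict from the front."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def access(self, key):
        hit = key in self.items
        if hit:
            self.items.move_to_end(key)         # refresh recency
        else:
            if len(self.items) >= self.capacity:
                self.items.popitem(last=False)  # evict least recently used
            self.items[key] = True
        return hit

cache = LRUCache(capacity=3)
for block in ["hot", "hot", "hot", "a", "b", "scan"]:
    cache.access(block)

# "hot" was accessed three times, but the one-off blocks "a", "b", and
# "scan" pushed it out: recency alone ignores frequency.
print("hot" in cache.items)  # False
```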
MCx architect Dan Cummins explains the enhancements to LRU as well as the cost-based approach:
New algorithms have been created to overcome the LRU pitfalls by taking into account both recency and frequency. Examples are LIRS, Multi-Queue, ARC, etc. They use ghost caches to monitor frequency of access and filter out accesses that are not more recent and frequent than what is already in the cache. The new cache design also employs an adaptive caching algorithm that is further optimized for cost of access. This new design leverages a multi-queue approach to determine which parts of the workload are the most important to keep based on recency, frequency, and cost of access. The recency and frequency filtering enables MCC to dynamically respond to changes in the application workloads. MCC now uses a third component: the cost of storage media access. This further classifies data and determines how long to age it out of the cache. Combining all three inputs results in more intelligent caching and increased hit rates in the face of dynamically changing workloads. The bottom line is that it all leads to increased IOPS and lower latency.
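This is not MCC's implementation, but here's a toy sketch of the ghost-cache idea Dan describes, covering only the recency-plus-frequency filtering (the cost-of-access input is omitted). The ghost list holds just keys (metadata, no data), and a missed block is admitted only on its second sighting:

```python
from collections import OrderedDict

class GhostFilteredLRU:
    """Toy admission filter in the spirit of ghost caches: one-off
    accesses never displace resident data, because a missed key must
    be seen twice (recent AND frequent) before it is admitted."""
    def __init__(self, capacity, ghost_capacity):
        self.cache, self.ghost = OrderedDict(), OrderedDict()
        self.capacity, self.ghost_capacity = capacity, ghost_capacity

    def access(self, key):
        if key in self.cache:            # hit: refresh recency
            self.cache.move_to_end(key)
            return True
        if key in self.ghost:            # second sighting: admit
            del self.ghost[key]
            if len(self.cache) >= self.capacity:
                evicted, _ = self.cache.popitem(last=False)
                self._remember(evicted)  # evictions go to the ghost list
            self.cache[key] = True
        else:
            self._remember(key)          # first sighting: metadata only
        return False

    def _remember(self, key):
        self.ghost[key] = True
        if len(self.ghost) > self.ghost_capacity:
            self.ghost.popitem(last=False)

cache = GhostFilteredLRU(capacity=3, ghost_capacity=8)
for block in ["hot", "hot", "a", "b", "c", "scan"]:
    cache.access(block)
print("hot" in cache.cache)  # True: the one-off blocks were filtered out
```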
Dan also mentioned additional MCC changes that increase IOPS:
- There are no longer separate read and write caches.
- The amount of cache made available to a particular LUN is dynamic.
- MCC implements a true Active/Active design: it can field requests to the same device from any storage processor without redirecting.
- MCC can ratchet down the consumption of cache pages by slow back-end devices, ensuring that sufficient pages remain available and that the performance of other devices is not impacted.
- MCC is highly optimized for access to back-end devices, employing techniques that increase HDD throughput by minimizing head movement.
- MCC amortizes seek overhead to improve cache effectiveness (see the elevator-style sketch after this list).
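As a rough illustration of the head-movement idea in those last two bullets, a one-directional sweep (elevator-style seek sorting) services queued LBAs with far less back-and-forth than FIFO order. This is a generic sketch, not MCC's actual scheduler:

```python
def elevator_order(pending_lbas, head_position):
    """Service pending requests in one ascending sweep from the current
    head position, then sweep the remainder -- minimizing long seeks
    versus servicing requests in arrival (FIFO) order."""
    ahead  = sorted(lba for lba in pending_lbas if lba >= head_position)
    behind = sorted(lba for lba in pending_lbas if lba < head_position)
    return ahead + behind

# FIFO order would seek 98 -> 5 -> 97 -> 6; the sweep visits 97, 98, then 5, 6.
print(elevator_order([98, 5, 97, 6], head_position=50))  # [97, 98, 5, 6]
```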
Regarding the new position of the MCF driver: MCC understands the underlying random-access capabilities of MCF and will avoid or minimize HDD-oriented algorithms (e.g. prefetching and seek sorting) against the MCF tier. MCC also takes advantage of true random-access performance, driving deeper queue depths to keep more flash channels busy and deliver more throughput.
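A media-aware dispatch layer might look something like the sketch below. The policy knobs and the specific values are assumptions for illustration only, not MCx internals:

```python
from dataclasses import dataclass

@dataclass
class BackendPolicy:
    prefetch: bool     # read-ahead only pays off on sequential HDD access
    seek_sort: bool    # reordering by LBA only matters when heads move
    queue_depth: int   # flash rewards deeper queues (more channels busy)

def policy_for(media: str) -> BackendPolicy:
    # Hypothetical values for illustration; not actual MCx settings.
    if media == "flash":
        return BackendPolicy(prefetch=False, seek_sort=False, queue_depth=64)
    return BackendPolicy(prefetch=True, seek_sort=True, queue_depth=8)

print(policy_for("flash"))
print(policy_for("hdd"))
```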
Dan shed further light on the re-positioning, which has resulted in the term "FAST Cache" being replaced with "MCF".
Positioning the FAST Cache technology below the DRAM cache improved I/O flow and system IOPS/response times for cache hits. Since MCR and MCC are true Active/Active designs, and since FAST Cache leverages these subsystems to move data between HDD and flash, the stack re-ordering implicitly enabled an Active/Active capability for FAST Cache. And finally, multi-core improvements were made to FAST Cache to enable it to scale across cores and sockets. The resulting technology was adopted as MCF (Multi-Core FAST Cache).
This concludes our look under the hood at the new MCx architecture. In addition to this post (which highlights the MCC and MCF layers), the following two posts serve as a great reference for summing up the full stack:
The MCR and back-end layer (MCx: Affine Way to Core Affinity)
The quality framework for MCx (Engine Swap)
Steve
EMC Fellow