In previous posts I have described how application workloads drove innovation in the areas of block (SAN), file (NAS), and object (CAS) storage systems. I discussed the evolution of these systems in the context of the increased importance of metadata, and used the diagram below to highlight the different roles these three systems play in metadata management.
In my last post I claimed that new forms of metadata would challenge the underlying infrastructure even further (and result in new forms of innovation).
For this post, I'd like to spend a bit more time describing one of the main areas where metadata expansion occurred: infrastructure-based workloads.
A wide variety of infrastructure-related applications (IRAs) began to surface as a result of the proliferation of block, file, and object storage systems within a data center. These IRAs, like their "traditional" application counterparts, needed to read and write their own "IRA data" to a robust, high-performance storage system. This "IRA data" can be thought of as metadata falling into one or more of four categories:
- Configuration Management.
- Data Center Security.
- Backup and Recovery.
- Content Management.
In order to more fully understand the impact of these applications within the data center, it helps to extend our diagram above to include these new applications alongside the "big 3" storage systems.
Infrastructure-based metadata, in general, has much more complex inter-dependencies than the more traditional content-based metadata. While content-based metadata often has a cardinality of one, infrastructure-based metadata can map to multiple pieces of content and multiple pieces of other infrastructure-based metadata. Consider the diagram below:
Traditional application metadata might enrich other pieces of content. In this example, patient metadata further describes content stored in an X-ray, an electronic medical record, and a doctor's notes. The relationship between application metadata and content has a cardinality of "zero or more".
Infrastructure-based metadata, on the other hand, has much more complex cardinality. Consider backup metadata. This form of metadata has to maintain relationships not only with metadata and content, but also with other infrastructure-based metadata, such as information about the data center infrastructure (where is my backup device?), security-based metadata (who is allowed access?), and/or compliance metadata (is there an audit process or workflow going on?).
Each type of infrastructure-based metadata has this kind of complex, zero-or-more cardinality with content, with application metadata, and with other infrastructure-based metadata.
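To make the difference in cardinality a little more concrete, here is a minimal Python sketch of the two kinds of metadata. The class names (`Content`, `AppMetadata`, `InfraMetadata`) and the `reachable` helper are hypothetical and exist only for illustration; they do not correspond to any particular product or API. The sketch also assumes each infrastructure record has a unique `kind`.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Content:
    """A piece of stored content, e.g. an X-ray image."""
    name: str


@dataclass
class AppMetadata:
    """Application metadata: describes zero or more pieces of content."""
    label: str
    describes: list[Content] = field(default_factory=list)


@dataclass
class InfraMetadata:
    """Infrastructure metadata: zero or more links to content, to application
    metadata, and to other infrastructure metadata."""
    kind: str
    content: list[Content] = field(default_factory=list)
    app_metadata: list[AppMetadata] = field(default_factory=list)
    related: list[InfraMetadata] = field(default_factory=list)


def reachable(node: InfraMetadata) -> set[str]:
    """Return the kinds of all infrastructure records reachable from node.

    Assumes kinds are unique, so a kind doubles as a visited marker.
    """
    seen: set[str] = set()
    stack = [node]
    while stack:
        current = stack.pop()
        if current.kind in seen:
            continue
        seen.add(current.kind)
        stack.extend(current.related)
    return seen


# Application metadata only points at content.
xray = Content("x-ray")
record = Content("electronic medical record")
notes = Content("doctor's notes")
patient = AppMetadata("patient", describes=[xray, record, notes])

# Backup metadata points at content, application metadata,
# and other infrastructure metadata.
config = InfraMetadata("configuration")
security = InfraMetadata("security")
compliance = InfraMetadata("compliance")
backup = InfraMetadata(
    "backup",
    content=[xray, record, notes],
    app_metadata=[patient],
    related=[config, security, compliance],
)
```

In this sketch the patient record has a single list of links, while the backup record carries three, one of which leads to further infrastructure records. Answering a question such as "what does this backup depend on?" therefore becomes a graph traversal (`reachable(backup)`) rather than a simple lookup.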
It's this exact issue that caused a tremendous surge of innovation, and I will highlight some of these innovations in future posts.