The slide below is copied from a presentation that I gave at an IT@Cork function in February of this year:
The slide describes how a data center architecture can move towards an application/data association framework that facilitates data valuation. The key to building the framework is a continual application inventory. In previous posts I've highlighted the first two phases of this inventory process:
- Phase 1 is characterized by identifying applications that can be decommissioned. Once identified, the journey to valuation begins by sunsetting the apps to an archival tier. As part of this process, the corporate policies about the application and the new location of its data are recorded in a repository known as a Metadata Lake.
- Phase 2 begins when the decommissioning process is complete. The next wave of application inventory identifies the "low-hanging fruit": inventoried applications that can readily be moved onto an infrastructure that possesses the appropriate level of trust (based on corporate policies for that application). Once again the policies and the location of the data are recorded in the same Metadata Lake created in Phase 1.
Phase 3 represents the cessation of the manual placement process that typified Phases 1 and 2. The inventory process for legacy applications reaches an end. As new applications are introduced into the framework, they are placed automatically, and each placement choice is recorded in the Metadata Lake so that it can be audited. For this automation to happen, the data center team has to introduce three new features into the stack. These features would ideally be in place by the end of Phase 2:
- The storage layer exports a common trust taxonomy across all storage devices.
- The software-defined data center layer gathers trust taxonomy data from all devices (e.g. compute, network, storage), and exports the aggregate as a set of trust services.
- The application deployment layer (e.g. CloudFoundry) adds Governed Placement Service logic to its application deployment algorithms.
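To make the three features a little more concrete, here is a minimal Python sketch of how they might fit together. Everything in it is an assumption made for illustration: the `Device`, `Zone`, and `MetadataLake` classes, the `place` function, the taxonomy terms such as `encrypted-at-rest`, and the rule that a zone's trust services are simply the union of what its devices export. It is not the API of CloudFoundry or of any storage product.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Optional, Tuple


@dataclass(frozen=True)
class Device:
    """A device that exports its trust attributes in the shared taxonomy."""
    name: str
    kind: str               # "compute", "network", or "storage"
    trust: FrozenSet[str]   # hypothetical taxonomy terms, e.g. "encrypted-at-rest"


@dataclass(frozen=True)
class Zone:
    """A candidate infrastructure: a group of devices managed by the SDDC layer."""
    name: str
    devices: Tuple[Device, ...]

    def trust_services(self) -> FrozenSet[str]:
        # Assumed aggregation rule: the zone offers every term any device exports.
        terms = set()
        for device in self.devices:
            terms |= device.trust
        return frozenset(terms)


@dataclass
class MetadataLake:
    """Append-only record of placement decisions (in memory for this sketch)."""
    records: List[dict] = field(default_factory=list)

    def record(self, app: str, policy: FrozenSet[str], zone: Optional[str]) -> None:
        self.records.append({"app": app, "policy": sorted(policy), "zone": zone})


def place(app: str, policy: FrozenSet[str], zones: List[Zone],
          lake: MetadataLake) -> Optional[str]:
    """Pick the first zone whose trust services cover the policy, and log the choice."""
    chosen = None
    for zone in zones:
        if policy <= zone.trust_services():
            chosen = zone.name
            break
    lake.record(app, policy, chosen)  # failed placements are recorded too
    return chosen


# Example data; every name and taxonomy term here is made up.
basic = Zone("zone-basic", (
    Device("array-1", "storage", frozenset({"replicated"})),
))
trusted = Zone("zone-trusted", (
    Device("array-2", "storage", frozenset({"encrypted-at-rest", "replicated"})),
    Device("fabric-2", "network", frozenset({"encrypted-in-flight"})),
))
lake = MetadataLake()
placement = place("payroll", frozenset({"encrypted-at-rest", "encrypted-in-flight"}),
                  [basic, trusted], lake)
print(placement, lake.records)
```

The point of the sketch is the shape of the interaction rather than the matching rule: the placement logic never inspects devices directly, it only compares an application's policy against the aggregated trust services, and every decision lands in the Metadata Lake whether or not a suitable zone was found.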
Once all of these features are in place, one can trace a deployment through the interplay between the placement logic and the Metadata Lake. The diagram below highlights this.
Once this framework is fully in place, it also needs to automate and capture future application retirement and/or application/data migration activity.
The Metadata Lake then becomes a Data Audit and Inventory System for use by internal data science teams, Chief Data Officers, Chief Compliance or Legal Officers, etc. It also becomes a repository that can be used to respond to external regulators who ask for proof of compliance with local, national, or international law.
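As a rough illustration of the kind of question such a system could answer, the Python sketch below queries a handful of made-up Metadata Lake entries. The record layout (`seq`, `app`, `event`, `location`, `policy`), the policy terms, and the two helper functions are all hypothetical; a real repository would have its own schema and query interface.

```python
from typing import Dict, List

# Hypothetical Metadata Lake entries; "seq" orders events in time.
RECORDS: List[dict] = [
    {"seq": 1, "app": "legacy-hr", "event": "decommission",
     "location": "archive-tier", "policy": ["retain-7y"]},
    {"seq": 2, "app": "crm", "event": "placement",
     "location": "zone-a", "policy": ["eu-resident", "encrypted-at-rest"]},
    {"seq": 3, "app": "crm", "event": "migration",
     "location": "zone-b", "policy": ["eu-resident", "encrypted-at-rest"]},
]


def audit_trail(records: List[dict], app: str) -> List[dict]:
    """Return every recorded event for one application, oldest first."""
    return sorted((r for r in records if r["app"] == app), key=lambda r: r["seq"])


def current_locations(records: List[dict], term: str) -> Dict[str, str]:
    """Map each app whose latest record carries `term` to where its data lives now."""
    latest: Dict[str, dict] = {}
    for r in sorted(records, key=lambda r: r["seq"]):
        latest[r["app"]] = r  # later events overwrite earlier ones
    return {app: r["location"] for app, r in latest.items() if term in r["policy"]}


print(audit_trail(RECORDS, "crm"))
print(current_locations(RECORDS, "eu-resident"))
```

The first query is the sort of history a compliance officer might want for a single application, including its migrations; the second is closer to a regulator's question, namely where all the data governed by a given policy currently resides.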
With the outline of this framework in place, I'd like to begin taking a look at data valuation processes and definitions. I'll begin describing these in future posts.