This post is the fourth in a series of data valuation insights shared at Evanta's Global CIO Executive Summit at the Skytop Lodge in Pennsylvania last week. Each insight was shared during a keynote introducing Data's Economic Value in the Age of Digital Business, and these blog posts represent my intention to socialize the insights more broadly. In my first blog post I recommended that CIOs insert themselves into the conversation about data value. In the second blog post I went on to say that there are two primary IT touch points for implementing valuation algorithms: data usage and data ingest. In my third post I built upon that insight by explaining that these touch points should strive to combine business and technical metadata into one repository. I also highlighted that my own company (Dell EMC) has implemented such an approach and successfully delivered new data services as a result.
In this post I'd like to discuss the fourth insight regarding annotation.
Fourth Key Insight: Annotate Data & Models with Valuation Metadata
The message here is straightforward:
- after the CIO has inserted him- or herself into a corporate discussion on data value, and…
- after the CIO has decided on the valuation algorithms that are appropriate for the business, and…
- after the CIO has created the touch points for combining business and technical metadata, then…
…it is time to provide the capability to annotate the value of data and models based on their relevance to the business.
The ability to annotate data based on its value sounds simple enough: call an API that associates a specific piece of metadata (e.g. BVI = 8, EVI = $100,000) with a data set or an analytic model.
However, the tricky part is to have this annotation function as part of an overall system that tracks data as a capital asset from cradle (e.g. ingest) to grave (e.g. deletion).
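To make the idea concrete, here is a minimal sketch of what such an annotation API might look like. This is a hypothetical illustration, not the actual AIM or DAC interface; the class names, the BVI/EVI metric labels, and the lifecycle events are assumptions drawn from the examples above.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class ValuationAnnotation:
    """One piece of valuation metadata attached to an asset."""
    metric: str      # e.g. "BVI" (business value index) or "EVI" (economic value)
    value: float
    annotated_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))

@dataclass
class CatalogEntry:
    """A data set or analytic model tracked from ingest to deletion."""
    asset_id: str
    asset_type: str  # "dataset" or "model"
    lifecycle: list = field(default_factory=list)    # ordered lifecycle events
    annotations: list = field(default_factory=list)  # valuation metadata

    def record_event(self, event: str) -> None:
        """Track a cradle-to-grave event (ingest, usage, deletion, ...)."""
        self.lifecycle.append((event, datetime.now(timezone.utc)))

    def annotate(self, metric: str, value: float) -> None:
        """Associate a valuation metric with this asset."""
        self.annotations.append(ValuationAnnotation(metric, value))

# Usage: ingest a data set, then annotate its value.
entry = CatalogEntry(asset_id="sales_q1", asset_type="dataset")
entry.record_event("ingest")
entry.annotate("BVI", 8)
entry.annotate("EVI", 100_000)
```

The point of the sketch is the pairing: the annotation call is trivial, but it only becomes useful because each entry also carries a lifecycle history from ingest onward, so the value of an asset can be interpreted in context.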
For this reason Dell EMC recently introduced a new module that can be embedded within our systems: the Analytic Insight Module (AIM). The diagram below highlights the creation of a catalog that can be used for valuation purposes.
At the heart of this diagram is the Data and Analytic Catalog (DAC). This component can serve as a catalog for every data set and every analytic model that is available for use within the system. Note that Attivio is used to scan an existing data lake and create the initial catalog by filling it with technical metadata.
I mentioned in CIO Insight #2 that ingest and data usage are the key IT touch points for valuation. The diagram above highlights that additional business/technical metadata can be created during ingest (by using technology from Zaloni). In addition, after data scientists perform their work in their own business context, they will "publish" their work results (analytic models and data sets) back into the data lake. This results in more business/technical metadata being added to the DAC.
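The two enrichment paths above, ingest and publish-back, can be sketched as a single merge operation against a catalog. Again, this is a toy in-memory stand-in for the DAC, with invented asset names and metadata fields, meant only to show how technical and business metadata accumulate on the same entry.

```python
# A toy in-memory stand-in for a Data and Analytic Catalog (DAC).
catalog = {}

def publish(asset_id, asset_type, technical_meta, business_meta):
    """Register an asset (or enrich an existing one), merging new
    technical and business metadata into its catalog entry."""
    entry = catalog.setdefault(asset_id, {
        "type": asset_type, "technical": {}, "business": {}
    })
    entry["technical"].update(technical_meta)
    entry["business"].update(business_meta)
    return entry

# Ingest contributes mostly technical metadata...
publish("churn_model_v1", "model",
        technical_meta={"format": "PMML", "source": "data_lake"},
        business_meta={})

# ...and a data scientist publishing results back into the lake
# enriches the same entry with business context.
publish("churn_model_v1", "model",
        technical_meta={},
        business_meta={"use_case": "customer retention",
                       "owner": "marketing"})
```

The design choice worth noting is that both touch points write to the same entry rather than separate stores, which is what lets business and technical metadata live in one repository, as argued in the earlier posts.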
The ability to associate metadata with data assets and models is the fourth piece of the puzzle for a full, system-wide implementation of data valuation. The remaining question is when the actual association of value gets stored into the DAC.
I will outline the answer to this question in Insight 5 of 5 on implementing a system for data valuation.
Fellow, Dell Technologies