Last week I outlined a set of five insights for a CIO to consider when building an IT environment that supports data valuation. These insights included:
- Insert yourself into the data value conversation.
- Focus on data workflow and ingest.
- Combine business and technical metadata.
- Annotate both data sets and analytic models.
- Build valuation business processes on top of these new IT valuation capabilities.
In this post I'd like to step through an industry use case to show how these five capabilities play together.
Consider any industry (e.g. oil and gas, energy distribution, finance) that struggles with monetizing data. A company in any of these industries may struggle with...
- ...how much to pay for a data set
- ...how to identify data sets that can be monetized
- ...creating data products and services
- ...understanding ROI on their investment in analytic models
The starting point for helping these companies increase their data valuation fluency, according to Bill Schmarzo, is to first focus on the value of business decisions they wish to make. Consider the diagram below.
In this diagram we see Step 1: assign value to a business decision. At this point the business does not know which data sets could potentially contribute to the decision, and they also don't have any idea about the potential economic value of the data sets and analytic models within their organization. This lack of understanding about the economic value of their data and models is the cause of their struggles to leverage data as a capital asset.
Step 1 might conclude, for example, that the business decision could bring 45 million dollars in additional revenue into the company. We've highlighted in Step 1 that the CIO needs to be involved in this conversation and begin to update the IT infrastructure, with an initial focus on ingesting data into a data lake before tracking business usage. Step 2, shown in the diagram below, highlights:
- The identification of assets (e.g. sensor data) that are relevant to the business decision.
- The ingestion of these assets into the lake.
- The creation of a catalog (e.g. AIM's Data and Analytics Catalog, or DAC) that supports annotation.
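To make the idea of an annotatable catalog more concrete, here is a minimal sketch in Python. It is not how the DAC or any particular product works; the `CatalogEntry` and `Catalog` classes, their field names, and the sample metadata values are all hypothetical, chosen only to show business and technical metadata sitting side by side with room for later annotations.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """One asset registered in the catalog (hypothetical schema)."""
    name: str
    technical: dict = field(default_factory=dict)    # e.g. format, location in the lake
    business: dict = field(default_factory=dict)     # e.g. owner, decision it supports
    annotations: list = field(default_factory=list)  # statements added over time


class Catalog:
    """In-memory stand-in for a data and analytics catalog."""

    def __init__(self):
        self._entries = {}

    def ingest(self, entry):
        # Register the asset at ingest time so usage can be tracked later.
        self._entries[entry.name] = entry
        return entry

    def annotate(self, name, note):
        # Attach a free-form statement to an existing entry.
        self._entries[name].annotations.append(note)

    def get(self, name):
        return self._entries[name]


catalog = Catalog()
catalog.ingest(CatalogEntry(
    name="sensor_data",
    technical={"format": "csv"},             # assumed value, for illustration
    business={"owner": "operations"},        # assumed value, for illustration
))
catalog.annotate("sensor_data", "relevant to the Step 1 business decision")
```

The point of the sketch is simply that the entry is created during ingest, so annotations have somewhere to land before any valuation work begins.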
In Step 3 we conduct the data science activity that will ultimately lead to the desired business outcome. The three data sets identified below become part of an analytic workspace, models are run, and new data sets are published along with the analytic models that created them. These new data sets and models are also published into the lake, creating a lineage graph that accompanies the catalog.
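One way to picture the lineage graph is as a mapping from each published asset to the assets it was derived from. The sketch below assumes that simple representation; the `LineageGraph` class and the asset names (`data_set_1`, `model_a`, `derived_data_set`) are hypothetical placeholders, not names from any real catalog.

```python
from collections import defaultdict


class LineageGraph:
    """Tracks which assets each published asset was derived from."""

    def __init__(self):
        self._parents = defaultdict(set)

    def record(self, output, inputs):
        # Note that `output` was produced from every asset in `inputs`.
        self._parents[output].update(inputs)

    def ancestors(self, asset):
        # Walk upstream to collect every direct or indirect contributor.
        seen = set()
        stack = [asset]
        while stack:
            node = stack.pop()
            for parent in self._parents.get(node, ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


lineage = LineageGraph()
source_sets = ["data_set_1", "data_set_2", "data_set_3"]

# A model is built from the three source data sets in the workspace...
lineage.record("model_a", source_sets)
# ...and the data set it publishes depends on the model.
lineage.record("derived_data_set", ["model_a"])

contributors = lineage.ancestors("derived_data_set")
```

Being able to ask "what contributed to this result?" is what later lets value flow back from a business decision to the data sets and models behind it.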
The fourth and final step is to annotate the data sets and models with statements of value. Given that the CIO and/or a representative have been involved in the dialogue from the beginning, they understand which algorithms should be used to assign value. For example, the simplest algorithm would assign value as follows:
- Divide the actual/potential business value ($45m) by the number of data sets
- Divide the actual/potential business value ($45m) by the number of analytic models
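The two divisions above can be sketched in a few lines of Python. The $45 million figure and the three data sets come from the example; the count of two analytic models and every asset name are assumptions made purely for illustration, as is the helper `assign_equal_value`.

```python
def assign_equal_value(business_value, assets):
    """Split a decision's business value evenly across the given assets."""
    if not assets:
        raise ValueError("at least one asset is required")
    share = business_value / len(assets)
    return {asset: share for asset in assets}


DECISION_VALUE = 45_000_000  # actual/potential value of the business decision

data_sets = ["data_set_1", "data_set_2", "data_set_3"]
models = ["model_a", "model_b"]  # hypothetical: the model count is assumed

data_set_values = assign_equal_value(DECISION_VALUE, data_sets)
model_values = assign_equal_value(DECISION_VALUE, models)

for name, value in {**data_set_values, **model_values}.items():
    print(f"{name}: ${value:,.0f}")
```

Under these assumptions each data set is annotated with $15 million and each model with $22.5 million. Data sets and models are valued separately here, so the two groups each sum to the full $45 million; a more refined algorithm might weight assets by their actual contribution.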
Using these algorithms results in the following value assignment.
With this approach, the infrastructure has enabled the association of statements of value with data sets and analytic models. The business can continue to iterate by applying data sets and models to different business decisions and continuing to annotate. Over time, as more and more data sets and models are annotated with value, the business develops the following capabilities:
- They have a much better sense of how much to pay when purchasing a data set
- They begin to understand the high-value data sets and models that could potentially be monetized
- They can focus on data quality initiatives for high-value data sets that could be offered as a service or product
- They can assess their investment in data science teams and processes and determine the effective ROI
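As a rough illustration of how repeated annotation surfaces high-value data sets, the sketch below accumulates equal-split value across more than one business decision and ranks the results. Only the $45 million decision and the sensor data come from the example above; the second decision, its $10 million value, and the other data set names are hypothetical.

```python
from collections import defaultdict


def accumulate_value(decisions):
    """Sum each data set's equal share of value across all decisions."""
    totals = defaultdict(float)
    for decision in decisions:
        share = decision["value"] / len(decision["data_sets"])
        for name in decision["data_sets"]:
            totals[name] += share
    return dict(totals)


decisions = [
    {
        "name": "decision_a",
        "value": 45_000_000,
        "data_sets": ["sensor_data", "maintenance_logs", "weather_feed"],
    },
    {
        "name": "decision_b",          # hypothetical second decision
        "value": 10_000_000,           # assumed value, for illustration
        "data_sets": ["sensor_data", "customer_accounts"],
    },
]

totals = accumulate_value(decisions)

# Rank data sets by accumulated value, highest first.
ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
for name, value in ranked:
    print(f"{name}: ${value:,.0f}")
```

In this made-up scenario the sensor data accumulates $20 million across both decisions and rises to the top of the ranking, which is the kind of signal that can guide purchasing, data quality, and monetization choices.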
This concludes the series of blog posts advising the CIO community on how to implement a data valuation framework.
For those CIOs or IT architects who are interested in continuing the discussion face to face, I recommend registering for and attending the upcoming Data West winter forum in San Diego on December 13-14, 2016.
Fellow, Dell Technologies