This post is the third in a series of insights shared at Evanta's Global CIO Executive Summit at the Skytop Lodge in Pennsylvania last week. The insights were first shared during a keynote introducing Data's Economic Value in the Age of Digital Business. During the talk I presented five key insights about data value. In my first blog post I recommended that CIOs insert themselves into the conversation about data value. In the second blog post I went on to say that there are two primary IT touch points for implementing valuation algorithms: data usage and data ingest. In today's post I'd like to build on this second insight by sharing a design approach that can be used to implement valuation for the data usage and data ingest use cases.
Third Key Insight: Combine Business & Technical Metadata
In my last post I shared the following:
"Data scientists are often exploring data in the context of a business problem. This offers the opportunity to join business context together with data. The joining together of business and technical metadata maps nicely to CIO Insight #1: joining the discussion about data's value to the business."
Our research into data's value led us to look at our own corporate data science initiatives, to understand how our internal analytics projects were being tied to business value and how that value was then associated back to the data sets and models. We discovered the work being driven by EMC's Chief Data Governance Officer, Barbara Latulippe. Barbara had been working with EMC's IT organization to build a new style of data governance framework that supported the valuation of data. This diagram is superimposed on an image of Barbara presenting her work during the Chief Data Officer and Information Quality Symposium at MIT this past July.
We discovered that an architecture had been created that accomplished the following:
- Attivio has the ability to look down onto a repository (e.g. a Data Lake) and discover the technical metadata (e.g. schemas) that is available in the lake.
- Collibra has the ability to track the business context in which the data is being used, e.g. as it is being ingested into the data lake, or as it is being used in data science activities.
- The Spring layer has the ability to glue together the technical and business metadata into one "catalog".
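To make the "one catalog" idea concrete, here is a minimal sketch of the three roles above in plain Python. It is not the Attivio, Collibra, or Spring API, and it is not EMC's implementation; every class, function, and field name (TechnicalMetadata, BusinessMetadata, CatalogEntry, build_catalog) is a hypothetical stand-in, and I am assuming the two kinds of metadata can be joined on a shared data set name.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class TechnicalMetadata:
    """What a discovery tool might find in the lake (hypothetical shape)."""
    dataset: str
    columns: dict[str, str]  # column name -> column type


@dataclass
class BusinessMetadata:
    """The business context in which a data set is used (hypothetical shape)."""
    dataset: str
    project: str
    purpose: str


@dataclass
class CatalogEntry:
    """One catalog row: a data set's schema plus every known business use."""
    dataset: str
    columns: dict[str, str]
    business_uses: list[BusinessMetadata] = field(default_factory=list)


def build_catalog(
    technical: list[TechnicalMetadata],
    business: list[BusinessMetadata],
) -> dict[str, CatalogEntry]:
    """Join technical and business metadata on the data set name.

    Assumption: the data set name is the shared key. Business records that
    point at a data set with no discovered schema are skipped.
    """
    catalog = {
        t.dataset: CatalogEntry(dataset=t.dataset, columns=dict(t.columns))
        for t in technical
    }
    for b in business:
        entry = catalog.get(b.dataset)
        if entry is not None:
            entry.business_uses.append(b)
    return catalog


def unused_datasets(catalog: dict[str, CatalogEntry]) -> list[str]:
    """Data sets with a schema but no recorded business context."""
    return sorted(name for name, e in catalog.items() if not e.business_uses)
```

With a catalog in this shape, a question such as "which data sets in the lake have no recorded business use yet?" becomes a simple lookup (unused_datasets above), which is the kind of link between data and business context that a valuation effort needs.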
The joining together of business and technical metadata allowed employees to "pipeline" data from raw ingest, to data innovation and consumption, to business insight, and all the way to the creation of one of EMC's first data services: MyService360. The diagram below highlights that as the governance (g) of metadata enrichment increased (the X-axis), the value (v) of the data increased as well (the Y-axis).
The ability of this framework to deliver a new data service was a turning point for our data value research. We began the research by hearing from our customers that they were struggling to create new data products and services. Internally we discovered that a team had successfully traveled down the path of data service creation. The key learning was the value of a framework that combines business and technical metadata.
The next phase of our research was to determine whether or not this key learning could be packaged up and delivered to the market at large.
The answer is yes, and in Insight 4 of 5 I will highlight new capabilities that the CIO community can leverage.
Fellow, Dell Technologies