In a world where information is viewed as an asset on equal footing with other assets (e.g. people, automobiles, cash, etc.), what is the ideal platform for tracking, capturing, and measuring data's value?
As Wikibon analyst George Gilbert pointed out in our interview with Doug Laney last month, treating data as an asset is difficult because...
"our systems weren't designed to track it, capture it, and measure it"
I believe that creating an ideal platform is made more complex by the existence of the legacy platforms referred to by George. We in the industry should take a good hard look at defining an ideal platform for data as an asset and then project this vision onto our existing systems.
Fortunately, Wikibon has already begun to publish their research findings on a Digital Business Platform (DBP) that treats data as a capital asset.
I've spent some time studying their findings and have compared these findings to the DBP that we're building internally (under the direction of EMC's Chief Data Governance Officer). It is probably too early to begin referring to EMC's internal data governance platform as a DBP (as defined by Wikibon), but I am finding a lot of similarities. One of the early success stories of EMC's DBP is the creation of a new data service known as MyService360 (Chad has written a good description of MyService360 here).
A Digital Business Platform, as defined by Wikibon, has five different classes of capabilities:
Each of the capabilities listed above is involved in the overall task of taking incoming data and performing the necessary digital conversion that enables the data to be "put to work" (in the same way that humans, cars, and money are put to work). Data Feedback Loops are central to the architecture because they are responsible for creating rich metadata that tracks the conversion of "analog" data into new formats that can benefit the business financially. EMC's internal DBP is in part a combination of our Business Data Lake solution and a governance framework that sits alongside it.
The Attivio layer scans the lake to create rich technical metadata. The Collibra governance layer creates additional rich metadata that combines business context with the technical metadata. The SpringXD layer contains functionality created by EMC that unites these two layers and joins data with its appropriate business context. One of the benefits of this approach is that it keeps track of the lineage of data transformation. The diagram below is an example of how raw source data (bottom row) is converted into intermediate "driver" data (middle rows) to produce business artifacts (end user assets).
This graph is similar to the "Data Feedback Loops" highlighted by the Wikibon research, and can enable the surrounding capabilities in the ecosystem. If each component in the Digital Business Platform has programmatic access to this type of graph, it becomes much more feasible to calculate the true business value of each piece of data (source, driver, end user).
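To make "programmatic access to this type of graph" a little more concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the dataset names, the `LINEAGE` dictionary, and the two helper functions are hypothetical, and the equal-split rule for passing value down the graph is just one simple choice, not a method taken from the Wikibon research or from EMC's internal platform.

```python
from collections import defaultdict

# Hypothetical lineage graph: each derived dataset maps to the inputs it was
# built from. Datasets that never appear as a key are raw sources.
LINEAGE = {
    "service_dashboard": ["install_base_health", "ticket_trends"],  # end user asset
    "install_base_health": ["device_telemetry", "contracts"],       # driver data
    "ticket_trends": ["support_tickets"],                           # driver data
}


def source_inputs(asset, lineage):
    """Return the set of raw sources that an asset ultimately depends on."""
    parents = lineage.get(asset)
    if not parents:
        # No recorded inputs, so this node is itself a raw source.
        return {asset}
    sources = set()
    for parent in parents:
        sources |= source_inputs(parent, lineage)
    return sources


def attribute_value(asset, value, lineage, totals=None):
    """Credit `value` to an asset, then split it equally among its inputs.

    The equal split is an illustrative assumption; a real platform would
    need a weighting that reflects how much each input actually contributes.
    """
    if totals is None:
        totals = defaultdict(float)
    totals[asset] += value
    parents = lineage.get(asset, [])
    for parent in parents:
        attribute_value(parent, value / len(parents), lineage, totals)
    return dict(totals)


if __name__ == "__main__":
    print(sorted(source_inputs("service_dashboard", LINEAGE)))
    for name, share in attribute_value("service_dashboard", 100.0, LINEAGE).items():
        print(f"{name}: {share}")
```

With a graph like this in hand, any component can ask two useful questions: which raw sources feed a given end user asset, and how much of that asset's estimated value might be credited back to each source and driver dataset along the way.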
Of course, EMC has a form of DBP for sale: the Data Lake. Many of the governance capabilities described above have been moved directly into the newest data lake solution being shipped later this year. In future posts I hope to dive more deeply into the Wikibon research on Digital Business Platforms and compare and contrast their research with delivered solutions such as the one implemented by Lightstorm.
Steve
EMC Fellow