In 2014 EMC partnered with Capgemini to conduct an industry survey on Big Data. Buried deep within the Big Data Report was the following quote:
“Among our respondents, 63% consider that the monetization of data could eventually become as valuable to their organizations as their existing products and services”.
In other words, a majority of the surveyed companies, which generate revenue from selling products and services, believed that this revenue could be augmented with new sources of value coming from data products and services. To validate and respond to the perceived needs of Dell EMC’s customers, the Office of the CTO launched a “Data Valuation” research initiative with Dr. Jim Short of the San Diego Supercomputer Center.
As this partnership enters its third year, the need for data valuation has been well documented. EMC has begun releasing data products and services for the first time, and data valuation services and software solutions are now available to customers. These services and solutions allow data center operators to build a “digital bank”. A digital bank is defined as a local repository (typically a Data Lake) with a catalog of data assets that have been annotated with statements of business value.
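To make the definition concrete, here is a minimal Python sketch of a digital bank as a catalog of value-annotated data assets. It is an illustration only, not the design of any Dell EMC product: the names `DataAsset`, `DigitalBank`, `deposit`, `annotate`, and `high_value_assets`, along with the numeric `estimated_value` score and the sample paths, are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class DataAsset:
    """One catalog entry: a data set plus its business-value annotations."""
    name: str
    location: str  # hypothetical path inside the local data lake
    value_statements: list[str] = field(default_factory=list)
    estimated_value: float = 0.0  # hypothetical score; units are up to the business


class DigitalBank:
    """A local catalog of data assets annotated with statements of business value."""

    def __init__(self) -> None:
        self._assets: dict[str, DataAsset] = {}

    def deposit(self, asset: DataAsset) -> None:
        """Register a data asset in the catalog."""
        self._assets[asset.name] = asset

    def annotate(self, name: str, statement: str, estimated_value: float) -> None:
        """Attach a statement of business value and update the asset's score."""
        asset = self._assets[name]
        asset.value_statements.append(statement)
        asset.estimated_value = estimated_value

    def high_value_assets(self, threshold: float) -> list[DataAsset]:
        """Return assets scoring at or above the threshold, highest first."""
        matches = [a for a in self._assets.values() if a.estimated_value >= threshold]
        return sorted(matches, key=lambda a: a.estimated_value, reverse=True)


bank = DigitalBank()
bank.deposit(DataAsset("customer_orders", "lake/sales/orders"))
bank.deposit(DataAsset("web_logs", "lake/raw/web_logs"))
bank.annotate("customer_orders", "Feeds the churn model used by sales", 90.0)
bank.annotate("web_logs", "Occasionally used for debugging", 10.0)

top = [a.name for a in bank.high_value_assets(threshold=50.0)]
print(top)
```

Keeping the value statements next to the catalog entry is the point of the sketch: once each asset carries its own annotation, finding the assets worth monetizing or protecting becomes a simple query over the catalog.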
The ability to create and operate a digital bank has the following advantages:
- Buying and selling data sets becomes more streamlined (the value of locally-owned assets is known)
- Identifying monetizable data assets becomes easier (high-value assets can be easily found)
- Improved data quality leads to higher data value, which in turn leads to new data products and services
- The value of analytic models can be ascertained, enabling ROI metrics for data science initiatives
- Data value can now be used as an input to data management
Every one of these benefits addresses the concerns that inspired the research. Capgemini survey respondents were generally aware of the need to increase profitability by leveraging data. New services and solutions are now enabling the creation of data valuation business processes on top of IT-enabled data catalogs. Analysts from Gartner and Wikibon are publishing related research findings.
The concept of digital banks, however, is new. Early movers face the challenge of extending the local digital bank concept to a federated (digital trust) model. How can geographically dispersed data assets, located within the distributed confines of multiple cloud providers, be protected, managed, and leveraged in the same way as privately controlled assets? This question can best be understood by highlighting example cloud provider use cases for data:
- Google cloud: for data sets needing big data analytic capability
- Azure cloud: for data set sharing
- Amazon cloud: for integration of batch and real-time processing
- Private cloud: for high-value data assets that bring maximum business returns (crown jewels)
A trust is defined as a collection of assets managed by an owner for the benefit of another. In the context of data valuation, which vendor(s) can create and manage the internet-scale digital trust depicted below? How can the enterprise of the future extract maximum business value from its entire distributed data portfolio?
One answer is to create a digital trust architecture. I will describe this architecture in upcoming posts.
Fellow, Dell Technologies