
February 26, 2015

Comments


Scott Lee

Hi, Steve! Metadata management for Lake architecture is very top-of-mind for me and the rest of the EMC² Global Services Transformation work-group.

Our conceptualization is a Data Catalog that describes every individual [published and public] Data Asset in the Lake. That little word 'in' can be a bit misleading, since the Data Lake is a federated architecture and the actual bytes may be physically or logically resident elsewhere. In that case, the Data Catalog entry represents a link or "external" table to the federated Data Asset.

But the brilliant part is that the consumer does not have to care about any of that - the Lake is agnostic to both location and source technology.
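
As a rough illustration only, a catalog of this kind could be sketched in Python as below. The class names, the "lake" and "warehouse" URI schemes, and the reader functions are all hypothetical, made up for the example and not taken from any EMC product or prototype. The point it tries to show is that the consumer asks for an asset by name and never sees where the bytes live.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class CatalogEntry:
    """One published Data Asset; all field names are illustrative."""
    name: str
    location: str      # hypothetical URI, e.g. "warehouse://crm/customers"
    federated: bool    # True when the bytes live outside the Lake


class DataCatalog:
    """Maps asset names to entries and hides the source technology."""

    def __init__(self) -> None:
        self._entries: Dict[str, CatalogEntry] = {}
        self._readers: Dict[str, Callable[[str], str]] = {}

    def register_reader(self, scheme: str, reader: Callable[[str], str]) -> None:
        # One reader per source technology, keyed by URI scheme.
        self._readers[scheme] = reader

    def publish(self, entry: CatalogEntry) -> None:
        self._entries[entry.name] = entry

    def read(self, name: str) -> str:
        # The consumer supplies only the asset name; the catalog resolves
        # the location and picks the matching reader.
        entry = self._entries[name]
        scheme = entry.location.split("://", 1)[0]
        return self._readers[scheme](entry.location)


catalog = DataCatalog()
# Stand-in readers; real ones would talk to the underlying stores.
catalog.register_reader("lake", lambda uri: f"rows from {uri}")
catalog.register_reader("warehouse", lambda uri: f"rows from {uri}")
catalog.publish(CatalogEntry("orders", "lake://sales/orders", federated=False))
catalog.publish(CatalogEntry("customers", "warehouse://crm/customers", federated=True))

for asset in ("orders", "customers"):
    print(catalog.read(asset))
```

Both calls look identical to the consumer, even though the second asset is federated and only linked from the catalog.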

Getting to the Catalog entries - you've raised an interesting use case for leveraging metadata at the Data Asset level. But we believe that the first use case most customers will want is Data Provisioning, i.e. giving an analyst or data scientist access to a piece of information so that they can create value with it.

There is a lot more I think we could talk about. Let's connect off-blog; I'd love to hear your thoughts on what we've drawn up and prototyped so far. Cheers!

