I continue to lead a Data Science project at EMC that focuses on Innovation Analytics. My approach has been to create a set of hypotheses and leverage a task force of volunteer data scientists to help me prove or disprove them.
During this process I ran into a roadblock on one hypothesis:
Hypothesis #7: Incubation Lineage and Asset Generation
I believe that the path that knowledge takes, from a local innovator, to a corporate boundary spanner, to an implementation team, to a delivered asset, can be traced and measured. I also believe that this measurement, once studied, can reveal ways to accelerate innovation and point out areas of knowledge that are yet to be converted. I've long been a fan of provenance, and I love the concept of "idea lineage". The lineage can be studied to reduce asset delivery time.
IH7a: Frequent knowledge expansion and transfer events reduce the amount of time it takes to generate a corporate asset from an idea.
IH7b: Lineage maps can reveal when knowledge expansion and transfer has not (or has not yet) resulted in a corporate asset.
The roadblock occurred because my data sets did not contain sufficient data to measure “elapsed time” between events. My mentor in this initiative, David Dietrich, suggested that I begin a longitudinal study. In a previous blog post this year I described his advice as follows:
Our ability to prove hypothesis #7 was in jeopardy. This realization was not the end of the world. In data scientist terms, it was time to begin a longitudinal study (making a series of observations over a long period of time). The team began to design a method whereby TRL (Technology Readiness) levels would be gathered and recorded as a regular part of the reporting and gathering of global innovation activities. Over time, we would eventually have enough data to take a good, hard look at our hypothesis.
After writing the blog post I commissioned the longitudinal study by collaborating with EMC Labs China researcher Diego Wu. Diego and I added two steps to our data gathering process:
- We introduced innovation lineage threads whenever new, significant research initiatives started (e.g. a university research collaboration was funded).
- We gathered innovation activities in the same way, but added these activities to a lineage thread when appropriate.
Diego implemented this framework, and two key insights emerged.
Asset Generation (or lack thereof) is Measurable
The diagram below is an actual record of an interaction that EMC had with the University of Florida. An innovation lineage thread was created (the U Florida lineage thread). As multiple innovation events were entered into the innovation analytics framework, they were also associated with the U Florida thread.
In this example, several assets emerged from the initial visit of a professor to EMC’s campus. These assets include the publication of a paper (good press for EMC), the hiring of a student (who was already up to speed on the EMC domain to which they were assigned), and the introduction of a new software feature into an existing product line. Each event occurred at a discrete point in time that is a measurable distance away from any other event in the lineage thread. In this case, three assets were generated within a six-month period. The time lapse becomes a data point that can be compared with other lineage threads to evaluate the “time it takes to generate an asset from an idea” (the original hypothesis).
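To make the "time lapse" idea concrete, here is a minimal sketch of how the elapsed time from the first event in a thread to each asset could be computed. It is not the framework Diego implemented: the tuple layout, the `days_to_assets` function, and every date below are assumptions invented for illustration.

```python
from datetime import date

# Hypothetical lineage thread: (event name, date, is this event an asset?).
# The dates are invented for illustration; they are not EMC's actual records.
events = [
    ("professor visit", date(2012, 1, 15), False),
    ("paper published", date(2012, 4, 2), True),
    ("student hired", date(2012, 5, 20), True),
    ("software feature shipped", date(2012, 7, 10), True),
]

def days_to_assets(events):
    """Return the days elapsed from the earliest event to each asset event."""
    ordered = sorted(events, key=lambda e: e[1])
    start = ordered[0][1]
    return {name: (when - start).days
            for name, when, is_asset in ordered if is_asset}

lapses = days_to_assets(events)
for name, days in lapses.items():
    print(f"{name}: {days} days after the first event")
```

The largest value in `lapses` is the time it took this thread to deliver its last asset, which is the figure that would be compared across threads.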
Not all ideas experience this level of success. Often, no significant corporate asset is produced. The use of innovation lineage threads allows us to calculate “asset generation success ratios” across all ideas registered in the system. Success can be correlated against many different factors, including geography, the people associated with the idea, the business unit they are part of, amount of investment, etc.
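A success ratio of this kind might be computed along the following lines. This is only a sketch under stated assumptions: the record layout, the region labels, and the `success_ratios` function are hypothetical, and the sample records are made up.

```python
from collections import defaultdict

# Hypothetical idea records: (idea id, geography, did it produce an asset?).
# These rows are illustrative only.
ideas = [
    ("idea-1", "Americas", True),
    ("idea-2", "Americas", False),
    ("idea-3", "APJ", True),
    ("idea-4", "APJ", True),
    ("idea-5", "EMEA", False),
]

def success_ratios(ideas):
    """Return the fraction of ideas that produced an asset, per geography."""
    totals = defaultdict(int)
    successes = defaultdict(int)
    for _idea_id, geography, produced_asset in ideas:
        totals[geography] += 1
        if produced_asset:
            successes[geography] += 1
    return {g: successes[g] / totals[g] for g in totals}

ratios = success_ratios(ideas)
overall = sum(1 for idea in ideas if idea[2]) / len(ideas)
print(ratios, overall)
```

Grouping by a different field (business unit, investment band, and so on) only requires changing which column is used as the key.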
While studying any given lineage thread, the level of engagement becomes obvious. For example, with the University of Florida I could see clear examples of employee and faculty engagement. They visited each other. They worked on a paper together. They attended a conference together. I could “count” the number of such entries in the database and calculate such metrics as “average number of engagements per month”.
This enables me to draw “heat maps” by asking, for example, which universities are sufficiently “engaged” (or which are not).
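The engagement metric and the heat-map question above can be sketched as follows. The log format, the university names, the one-per-month threshold, and both functions are assumptions chosen for illustration, not part of the actual framework.

```python
from datetime import date

# Hypothetical engagement log: (university, date of engagement).
# Names and dates are invented for illustration.
engagements = [
    ("University A", date(2012, 1, 10)),
    ("University A", date(2012, 2, 14)),
    ("University A", date(2012, 3, 3)),
    ("University A", date(2012, 3, 28)),
    ("University B", date(2012, 1, 5)),
]

def engagements_per_month(log, start, end):
    """Average engagements per calendar month in [start, end], per university."""
    months = (end.year - start.year) * 12 + (end.month - start.month) + 1
    counts = {}
    for university, when in log:
        if start <= when <= end:
            counts[university] = counts.get(university, 0) + 1
    return {u: n / months for u, n in counts.items()}

def heat(rate, threshold=1.0):
    """Label a rate; the threshold of one engagement per month is an assumption."""
    return "engaged" if rate >= threshold else "not engaged"

rates = engagements_per_month(engagements, date(2012, 1, 1), date(2012, 4, 30))
labels = {u: heat(r) for u, r in rates.items()}
print(rates, labels)
```

A university with no entries in the window does not appear in `rates` at all, so a real report would need to start from the full list of partners rather than from the log.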
As the Director of Global Innovation, I struggle to keep track of the health of global innovation initiatives at EMC. For example:
- EMC has dozens of university research partners around the world. How engaged are we with each one?
- EMC has committed to incubate nearly 30 winning ideas from the 2012 Innovation Showcase process. Which ones are still alive? Which ones have stalled? What kind of corporate assets result from EMC’s yearly investment in the Showcase?
- How can we fail fast? Which funded innovation initiatives are not likely to succeed and should experience “de-investment”?
The decision to begin a longitudinal study was the right one. I’m also convinced that using the Data Analytics Lifecycle (shown below, and taught in EMC’s Data Scientist Course) was the right idea. The project will most certainly progress to Step 6, causing the EMC Executive team to make operational changes to our innovation frameworks.
If you have experience with a concept similar to innovation lineage threads, I'd be interested to hear more about your approach.