EMC has a number of data scientists who are part of EMC Labs China. They have been experimenting with different analytics packages as part of our efforts to characterize and improve the creative culture across our global innovation ecosystem.
One of the tools that has yielded great insight is the Stanford Topic Modeling Toolbox (TMT), which its authors describe as follows:
"The TMT brings topic modeling tools to social scientists and others who wish to perform analysis on datasets that have a substantial textual component."
Our innovation database holds roughly 6,000 employee ideas submitted over a five-year period as part of EMC's annual Innovation Showcase. In the context of our Showcase, topic modeling yields insight into:
- The underlying semantic structure of idea submissions.
- The overall abstract topics represented by the idea submissions.
Data Scientist Tao Chen (EMC Labs China) ran a topic modeling algorithm to divide the ideas into twenty-five abstract topics. The chart below displays a "sum of documents" from 2011:
This chart essentially breaks down idea submissions into twenty-five different categories. From this chart it is clear that Topic 22 is the area that generated the most ideas. The toolkit allows a programmer to drill down into each topic and see the "word scoring" for that particular topic. Topic 22 has the following word breakdown:
The most common topic submitted by EMC employees was related to employee engagement, productivity, and benefits. In other words, EMC employees had quite a bit of interest in their relationship to their corporation.
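For readers who want to see the mechanics, here is a minimal Python sketch of the two steps described above: summing documents per topic and ranking the words within the largest topic. It is not the toolbox's own code, and it is not what Tao ran. The idea names, topic shares, and word scores are made-up toy values, and the function names sum_of_documents and top_words are hypothetical. I am also assuming that a "sum of documents" is simply each topic's share added up across all ideas.

```python
from collections import defaultdict

# Hypothetical toy data, NOT taken from the Showcase database.
# Each idea maps to the share of its text assigned to each topic.
doc_topics = {
    "idea-001": {22: 0.75, 3: 0.25},
    "idea-002": {22: 0.5, 7: 0.5},
    "idea-003": {3: 0.5, 7: 0.5},
}

# Hypothetical word scores for one topic (higher = more characteristic).
topic_words = {
    22: {"employee": 0.09, "benefits": 0.05, "productivity": 0.07, "storage": 0.01},
}


def sum_of_documents(doc_topic_weights):
    """Add up each topic's share across all documents (assumed definition)."""
    totals = defaultdict(float)
    for weights in doc_topic_weights.values():
        for topic, share in weights.items():
            totals[topic] += share
    return dict(totals)


def top_words(word_scores, n=3):
    """Return the n highest-scoring words for a single topic."""
    return sorted(word_scores, key=word_scores.get, reverse=True)[:n]


totals = sum_of_documents(doc_topics)
largest_topic = max(totals, key=totals.get)
print(largest_topic, totals[largest_topic], top_words(topic_words[largest_topic]))
```

With this toy data, topic 22 comes out on top with a total of 1.25 documents, and its three highest-scoring words are "employee", "productivity", and "benefits". The real analysis works the same way in spirit, just across twenty-five topics and thousands of ideas.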
As the Showcase organizer, I could potentially use this fact to generate more interest in the next Showcase (e.g. ask for more ideas that are specifically related to the topic of employee engagement, productivity, and benefits).
EMC is a high-tech company. What are the most relevant technical ideas coming from the employees? And how do they evolve year-over-year?
I will attempt to answer these questions (via topic modeling) in an upcoming post.
Steve
Twitter: @SteveTodd
Director, EMC Innovation Network