21 December 2016


Sell your Data Science Project using Data Visualization


The good old saying “a picture is worth a thousand words” suggests that a complex idea that would take many words to explain can be conveyed in a single image. That sounds like something we can use to explain and sell data science projects.

How often do we run into blank stares or confused looks when we share the results of our data science projects, the models we created, and the potential impact they can have on the business? Too often, right?

Often it’s helpful to leverage tools the business is already familiar with. Excel, for example, is a good medium. But it also takes time and tedious work to get it to look good. And if you want any interactivity in Excel, you need to code.

A whole market segment evolved to help address tedious, time-consuming Excel work: Data Visualization tools. Data Visualization tools, like Qlik and Tableau, were built to turn tables of data into visual representations of that data, so that business people can see data in a more intuitive way. A set of good principles has been developed over the past centuries, and data visualizations have evolved from pieces of art that took a long time to create into interactive views of the data that can be explored in seconds.

Let’s use data visualizations to convey our data science and apply it in the business. Here are a couple of practical examples of how you can use RapidMiner with both Qlik and Tableau.

RapidMiner and Qlik Sense

In this first example, we’ve built a dashboard in Qlik Sense that provides a business-friendly clustering application. We’ve created a basic integration between RapidMiner AI Hub and Qlik that lets users put RapidMiner data science to work creating clusters and detecting outliers – and do so all through a Qlik dashboard.

In this illustration, the business user can drag a slider to select the number of desired outliers. The outliers are then removed from the data and the remaining data is segmented into clusters. This is a very intuitive process for users: they can see which records are removed by the outlier detection and what the resulting clusters are.
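To make the two-stage logic concrete, here is a minimal sketch in plain Python. It is not the RapidMiner process behind the dashboard: the post does not say which algorithms were used, so distance-from-centroid outlier ranking and a basic k-means are assumptions chosen for illustration, and every function name here is hypothetical.

```python
import math


def centroid(points):
    """Mean position of a non-empty list of (x, y) points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def remove_outliers(points, n_outliers):
    """Drop the n_outliers points farthest from the overall centroid.

    Assumption: "outlier" means "far from the centre of the data".
    Returns (kept, removed).
    """
    if not points:
        return [], []
    n_outliers = max(0, min(n_outliers, len(points)))
    center = centroid(points)
    ranked = sorted(points, key=lambda p: math.dist(p, center))
    keep = len(ranked) - n_outliers
    return ranked[:keep], ranked[keep:]


def k_means(points, k, iterations=20):
    """Basic k-means, seeded with the first k points so runs are repeatable."""
    if not points:
        return []
    k = max(1, min(k, len(points)))
    centers = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centre.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        # Move each centre to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centers[c] = centroid(members)
    return labels


def segment(points, n_outliers, k):
    """The slider value drives outlier removal; the rest is clustered."""
    kept, removed = remove_outliers(points, n_outliers)
    return kept, removed, k_means(kept, k)
```

Calling `segment(points, n_outliers, k)` with the slider value as `n_outliers` returns the kept records, the removed records, and one cluster label per kept record, which is the information a dashboard would need to redraw.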

What is going on behind the scenes? We created our model in RapidMiner Studio and exposed it as a web service in RapidMiner Server (now RapidMiner AI Hub). We built the Qlik Sense dashboard so that when the user moves the slider, i.e., selects the input value for the model, Qlik Sense calls RapidMiner AI Hub and passes that value along. RapidMiner AI Hub runs the model with this input value and returns the result to Qlik Sense, which, in turn, updates the dashboard. It’s really that easy.
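The round trip can be sketched in a few lines of Python. This is only an illustration of the request/response pattern, not the actual Qlik Sense or RapidMiner AI Hub interface: the parameter name `n_outliers`, the JSON reply shape, and all function names are assumptions, and the network call is replaced by a plain function so the example runs on its own.

```python
import json
from urllib.parse import parse_qs, urlencode


def score(n_outliers):
    """Hypothetical stand-in for the deployed model; returns a canned summary."""
    return {"n_outliers": n_outliers, "status": "ok"}


def handle_request(query_string):
    """Service side: read the input value, run the model, reply with JSON."""
    params = parse_qs(query_string)
    n_outliers = int(params["n_outliers"][0])
    return json.dumps(score(n_outliers))


def on_slider_change(value, send=handle_request):
    """Dashboard side: send the slider value, parse the reply for redrawing.

    `send` stands in for the HTTP call; in a real deployment it would
    issue a request to the web service URL instead of calling a function.
    """
    query_string = urlencode({"n_outliers": value})
    return json.loads(send(query_string))
```

Keeping the transport behind the `send` argument means the dashboard-side logic can be tried out without a running server and pointed at a real endpoint later.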

Here’s another example applying the same concept:

The user selects a territory from the map in the bottom left corner. Qlik Sense then calls RapidMiner, passing along the state identifiers. RapidMiner runs a predictive model on the selected states to find the main factors that influence churn. The results update the Qlik Sense dashboard, showing the factors that influence churn in these states (yellow bars) vs. the general population (orange bars). This gives business users machine-learning-based insights that show them where to focus activities in sales, marketing and customer success.
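As a rough illustration of the "selected states vs. general population" comparison, the sketch below scores each yes/no factor by the gap in churn rate between customers with and without it. This is a deliberately simple substitute for the predictive model described above, not what RapidMiner computes; the record fields (`state`, `churned`) and the function names are hypothetical.

```python
def churn_rate(customers):
    """Share of customers who churned; 0.0 for an empty group."""
    if not customers:
        return 0.0
    return sum(1 for c in customers if c["churned"]) / len(customers)


def influence(customers, factor):
    """Churn-rate gap between customers with and without a yes/no factor."""
    with_factor = [c for c in customers if c[factor]]
    without_factor = [c for c in customers if not c[factor]]
    return churn_rate(with_factor) - churn_rate(without_factor)


def compare_influence(customers, selected_states, factors):
    """Score each factor for the selected states and for everyone."""
    selected = [c for c in customers if c["state"] in selected_states]
    return {
        factor: {
            "selected": influence(selected, factor),  # yellow bars
            "overall": influence(customers, factor),  # orange bars
        }
        for factor in factors
    }
```

A factor whose "selected" score is well above its "overall" score matters more in the chosen territory than in the customer base as a whole, which is the contrast the two bar colors are meant to show.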

Here we have a business application, where the value of machine learning and the results of the particular models are visually displayed in software that the business users are very familiar with. The interactive nature of the Qlik Sense-RapidMiner application is so much more illustrative of the value than a static image or even an explanation of a table of characteristics for churn.

And the really nice thing is that this is not that hard to do! Take a look at the documented step-by-step instructions of how to do it for both Qlik Sense and QlikView.

RapidMiner and Tableau

If you use Tableau, it’s also possible to have this same type of bi-directional interaction with RapidMiner if you are using the new Tableau v10 release. We just tried it out, and in this example we created a decision tree model to identify customers making fraudulent money transfers.

This Tableau visualization shows the Customer ID (far left), a number of influence factors, and the model’s confidence that the customer is fraudulent (the sorted bar column on the far right). What’s neat here is that we can show business users our model in action and which characteristics of a customer actually classify them as potentially committing fraud. In Tableau, business users can quickly sort and filter on any of the columns to see patterns and, that way, build their understanding of and confidence in the model.

Another Tableau example, which does clustering based on inputs from the user, is shown in this short video. The business can “play” with the data right in a dashboard, get a feel for the data and the model, and imagine how it could be applied.
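For readers who want to see the shape of the data behind such a view, here is a small Python sketch of a decision tree producing a fraud confidence per customer, followed by the sort-and-filter step. The tree, its thresholds, its leaf confidences, and the field names (`transfers_per_day`, `account_age_days`) are all invented for illustration; the actual model and its factors are not described in this post.

```python
def fraud_confidence(customer):
    """Hand-written two-level tree; leaf values are made-up confidences."""
    if customer["transfers_per_day"] > 5:
        if customer["account_age_days"] < 30:
            return 0.9  # many transfers from a very new account
        return 0.6  # many transfers from an established account
    return 0.1  # ordinary transfer volume


def rank_by_confidence(customers, minimum=0.0):
    """Attach a confidence to each record, filter, and sort descending."""
    scored = [dict(c, confidence=fraud_confidence(c)) for c in customers]
    flagged = [c for c in scored if c["confidence"] >= minimum]
    return sorted(flagged, key=lambda c: c["confidence"], reverse=True)
```

Because each confidence comes from an explicit path through the tree, a business user who sorts by confidence can read off which factor values put a customer at the top of the list.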

From Selling to Believing

These illustrative, and even fully deployable, analytics applications can speak a thousand words about the models you deliver to stakeholders and the business. When these things become more tangible to the business, they might get more excited about machine learning. And that makes your life so much easier! You start having more conversations about how to use models to improve the business instead of explaining what is happening in the modeling tool. Once business users get it, you can start implementing your results in business systems directly. And then you are likely to earn even more support to automate things further.

Try it out by requesting a demo today!
