We recently announced a game-changing integration between Tableau and RapidMiner. In a previous blog post, we explored what the impact of using these two leading platforms together would be for Tableau users. In this post, we want to talk about what the impact will be for RapidMiner users.
In short, using Tableau will make the models that you build more explainable and accessible by plugging them into a tool everyone understands. This makes it easier to get buy-in for your work, and lets you present data science in a way that business users are already familiar with.
Let’s take a look at why visualization is so important, as well as some of the specific ways that using Tableau to present the work you do in RapidMiner can improve the business impact of your work.
Why Data (Science) Visualization Matters
People who don’t spend a lot of time with data often think of data visualization and data science as completely different domains.
While this thinking is inherently flawed, it’s easy to see how it could develop. Business intelligence and data science have traditionally been different functions, and their practitioners use different tools for their work. Business analysts use software that lets them visualize historical data, while data science and machine learning platforms use that data to create models and, ultimately, predictions. It’s also the case that the most common users of data visualization and dashboarding technologies typically possess very different skills and responsibilities than your typical data scientist.
However, this split—thinking of data science and business intelligence as separate domains—is itself a problem, and it can doom the impact that data science (and business intelligence for that matter) can have on your organization.
Data visualization, analytics, and data science are inextricably linked, which is why one of our key principles at RapidMiner is making data science accessible to anyone. After all, a data scientist’s main role is helping to solve the business problems that are encountered during the work of business intelligence, and if business analysts can do some of that work for themselves, they’re better able to understand their data and work with data scientists.
So connected are these two domains that there’s even a school of thought that says it’s impossible to get a data science project across the finish line if you don’t have a foundation in basic descriptive reporting and business intelligence.
In theory, this aligns with the tried-and-true CRISP-DM model, which dictates that you need to start with the business problem at hand and repeatedly re-visit this problem in cycles to ensure that the results you’re getting are aligned with the original goal, as well as the inherently iterative nature of data science and machine learning projects.
In practice, this means that the finished product of a data science project should be something like a dashboard or chart that informs better decision-making across the business and/or shows how the model is driving business value, whether automatically or by generating insight that humans can act on. This is the ultimate goal of any data science initiative: not just to create business value, but to create measurable and explainable business value.
With that overview in place, let’s explore a few different ways that visualization is critical to the data science lifecycle, and why data scientists should be keen to use visualization as a part of their work.
Data science is sometimes defined as trying to find a predictive signal in your data. Though in reality this is only one step of the data science lifecycle, trying to find a predictive signal is the principal goal of the data exploration phase. Data exploration is exactly what it sounds like: fishing through mountains of data to find something that needs to be fixed, shaped, or transformed with data cleaning and preparation, or to find trends in your data that may influence the model creation part of your work.
Not having robust data visualizations to support your data exploration efforts is like navigating uncharted territory. You’re going step by step—and sometimes even row by row and column by column—through tables of data, which is tedious, slow, and prone to errors.
While exploring the unknown might be fun and glamorous in the real world, that’s not the case when it comes to data science. In the business world, this is just a recipe for frustration and delayed business impact. According to Accelerate Your Data-Driven Transformation, a commissioned study conducted by Forrester Consulting on behalf of RapidMiner, companies that have 11 or more models in production are seeing a return of 5.8 times their investment in data science projects, while companies with 10 or fewer are seeing only 3.8 times. And while both groups of companies expect those numbers to grow in the next two to three years, the 11+ group expects its ROI to reach 9.3 times, while the 10-or-fewer group expects only 5.5 times. These figures highlight the importance of getting more models into production sooner to fully reap the benefits of data science.
So just as you’re more likely to get where you’re going quickly and efficiently with a navigation app and turn-by-turn directions, the insight that visualization can provide can speed your trip through data exploration so that you can get to model development and training quickly and efficiently.
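To put the Forrester figures quoted above side by side, a quick calculation helps. The sketch below uses only the four ROI multiples cited from the study; the variable and function names are illustrative, and the ratios it prints are our own arithmetic rather than figures reported by the study.

```python
# ROI multiples quoted from the study (return per unit invested).
current_roi = {"11+ models": 5.8, "10 or fewer models": 3.8}
expected_roi = {"11+ models": 9.3, "10 or fewer models": 5.5}


def relative_advantage(roi_by_group):
    """Ratio of the 11+ group's ROI multiple to the 10-or-fewer group's."""
    return roi_by_group["11+ models"] / roi_by_group["10 or fewer models"]


current_gap = relative_advantage(current_roi)
expected_gap = relative_advantage(expected_roi)

print(f"Today: {current_gap:.2f}x the return of the smaller group")
print(f"Expected: {expected_gap:.2f}x the return of the smaller group")
```

By this arithmetic, companies with 11 or more models in production see roughly one and a half times the return of the other group today, and they expect that gap to widen.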
What good is a robust and accurate model if you can’t explain it to your peers, management, and outside stakeholders? If you’re a data scientist, you’re probably thinking of all the ways you could use such a model. But if you can’t explain it and its results, it’s unlikely the model is going to make it into production.
This is one of the biggest issues causing the Model Impact Epidemic: leaders are often not willing to gamble on putting something into production that they don’t understand, no matter how confident you are as the person who built the model. This is why the ability to explain your model is so critical to making sure that your work has real business impact.
And this is where Tableau can really help. Business leaders and other stakeholders are used to looking at Tableau dashboards and visualizations, so if you can package up your predictions and results in Tableau (more about how to do this below) you’re well on your way to getting buy-in for your project.
Model operations are mainly thought of as the practice of deploying and managing models at scale. But I believe that anything that helps you get your model into production quickly and effectively can be viewed as part of the model operationalization process. After all, the first step is productionizing a model—if you don’t get it into production, there’s no operationalization to be had.
But how can you convince stakeholders to invest resources into productionizing and operationalizing a model if they don’t understand it? How can you easily monitor the ongoing impact of the model if you’re not projecting the model’s predictions and value into a visualization of some sort? How can you quickly detect problems with your model, like model drift, if you’re not viewing the performance of the model in a dashboard? How can people draw insight and make better decisions based on your hard work if you’re not visualizing it?
The answer to all of these questions is simple: you can’t. Or at least not easily; it’s going to be an uphill battle, which is why visualization is a foundational component of model operations.
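As one illustration of what such a dashboard might track, the sketch below computes accuracy per week from a log of predictions and actual outcomes, then flags weeks that fall too far below a baseline. The record layout, the `weekly_accuracy` and `flag_drift` names, the sample data, and the thresholds are all hypothetical. This is a minimal sketch of one simple drift check, not part of the Tableau and RapidMiner integration.

```python
from collections import defaultdict
from datetime import date


def weekly_accuracy(records):
    """Group (day, predicted, actual) records by ISO week; return accuracy per week."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for day, predicted, actual in records:
        iso = day.isocalendar()
        week = (iso[0], iso[1])  # (ISO year, ISO week number)
        totals[week] += 1
        hits[week] += int(predicted == actual)
    return {week: hits[week] / totals[week] for week in sorted(totals)}


def flag_drift(accuracy_by_week, baseline, tolerance=0.05):
    """Return the weeks whose accuracy is more than `tolerance` below `baseline`."""
    return [
        week
        for week, accuracy in accuracy_by_week.items()
        if accuracy < baseline - tolerance
    ]


# Hypothetical prediction log: (date, predicted label, actual label).
log = [
    (date(2021, 3, 1), "churn", "churn"),
    (date(2021, 3, 2), "stay", "stay"),
    (date(2021, 3, 3), "stay", "stay"),
    (date(2021, 3, 4), "churn", "churn"),
    (date(2021, 3, 8), "stay", "churn"),
    (date(2021, 3, 9), "churn", "stay"),
    (date(2021, 3, 10), "stay", "stay"),
    (date(2021, 3, 11), "stay", "churn"),
]

accuracy = weekly_accuracy(log)
drifting = flag_drift(accuracy, baseline=0.9)
print(accuracy)
print(drifting)
```

A per-week series like this is the kind of data that could be plotted on a dashboard so that a drop in model performance is visible at a glance rather than buried in a table.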
Putting Tableau and RapidMiner Together
There’s tremendous potential for both business analysts and data scientists of all skill levels to leverage Tableau’s market-leading BI platform to help them explore data, explain models, and operationalize what they’ve built. With a few clicks, you can articulate your model in the common visual language of business, garnering support and helping iterate on your work.
The integration between Tableau and RapidMiner is bi-directional and leverages the Tableau Analytics Extension, as well as the Tableau Server web API. It allows you to:
- Prepare data sets for Tableau, since most RapidMiner users are already familiar with our simple approach to data prep.
- Visualize your real-time predictions in Tableau dashboards to bring your models where the rest of the business lives—their Tableau BI environment.
- Easily enrich existing, mission-critical dashboards with predictions about the future using your knowledge of RapidMiner.
Data scientists have always had the ability to visualize their work, but with this integration, doing so is quicker and easier than ever before: you can connect to a leading visualization platform and create compelling explanations of your work. This enables better collaboration with all of your stakeholders so you can explain how models work and why they make the predictions that they do. That’s depth for data scientists, simplified for everyone else.
If you’d like to learn more about our integration with Tableau, and see it in action, check out our on-demand webinar Better Together: An end-to-end data science & visual analytics environment.