
13 September 2022


How to Make Impactful Machine Learning Models, Not “Data Science Wall Art”

As a member of RapidMiner’s Data Science Services team, I see customers make one mistake more than almost any other: they use data science to create a beautiful dashboard… and then it just sits there.

Creating a model whose output has no real impact on the business wastes everyone’s time and resources—from the data scientists who worked on the model to the operators on the shop floor whose work could have benefitted from an effective ML model.

If you create a model with no actionable output, you haven’t just spent hours doing data science; you’ve been creating what I like to call “data science wall art.”

In this post, we will offer suggestions for how to avoid throwing data into the ether and use it to make actionable models instead.

What “Data Science Wall Art” Looks Like & How to Avoid It 

There is a big misconception that just doing data science is enough. I hate to break it to you, but it’s not. Behind every data science project, there is a big “So what?” If you can’t answer that question, your model is essentially useless.

A Real-World Example 

Manufacturing is one of the industries that stands to gain the most from data science. With all the data manufacturers collect on the shop floor, there are endless possibilities to make production more efficient and cost-effective, reduce product time to market, and improve overall product quality. 

But too often, engineers are so focused on building a model that satisfies their curiosity that they don’t stop to consider whether what they are building supports a worthwhile use case. When this goes unchecked, useless models are created, and the rest of the organization starts to lose trust in data science. Why should they continue to invest in something that provides no business value?

For example, say you are using real-time sensor data to build a machine learning model. That model then tells you that in 20 minutes, your pressure gauge will give high readings. That’s great but…

Without an actionable outcome, you’re left asking, “So what???”

If there is no action to take, the model is not serving any real purpose. 

Our Advice 

First things first—go back to basics. And in data science, “basics” means the CRISP-DM cycle, or the Cross-Industry Standard Process for Data Mining (we walk through this in detail in our Human’s Guide to Machine Learning Projects).  

The first phase of CRISP-DM is Business Understanding, and it is what we often find lacking in models that are not having their intended impact. In this stage, users scope out the project and define its business expectations.

When you design a use case, the first question you should ask is:

What will you do with the information from the model compared to what you would do if you did not have it?  

That will help you define:

  1. Whether you have a viable use case, and
  2. What potential actions the model will lead you to take

For manufacturers especially, it is essential to get your domain experts involved in this step—they’re the ones who know the business like the back of their hand.

Successful data science initiatives are a team sport; getting multi-disciplinary, multi-department involvement in your projects and getting input from team members with different skillsets and backgrounds is the best way to break down silos and ensure you are building the best possible solution. 

Let’s go back to our real-time sensor data example. What would you do differently if you could predict pressure changes in advance?

Rather than shrugging your shoulders and sending the model into production regardless, your looped-in domain expert could document the current process and help set up the model to alert the operator to pull a lever and adjust the pressure.
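To make that concrete, here is a minimal Python sketch of what tying a prediction to an action might look like. Everything in it is hypothetical and for illustration only: the threshold value, the Alert class, the alert_for_prediction function, and the wording of the instruction are all assumptions. In a real project, the threshold and the action would come from your domain experts.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical limit, for illustration only. A real threshold (and its units)
# would be defined by the domain experts who know the equipment.
HIGH_PRESSURE_THRESHOLD = 100.0


@dataclass
class Alert:
    """An instruction for the operator, not just a number on a dashboard."""
    minutes_ahead: int
    predicted_pressure: float
    action: str


def alert_for_prediction(predicted_pressure: float, minutes_ahead: int) -> Optional[Alert]:
    """Turn a model forecast into an operator action, or None if nothing needs doing."""
    if predicted_pressure <= HIGH_PRESSURE_THRESHOLD:
        # The forecast is within the assumed normal range: no action required.
        return None
    return Alert(
        minutes_ahead=minutes_ahead,
        predicted_pressure=predicted_pressure,
        action="Pull the lever to reduce pressure",  # assumed action text
    )


if __name__ == "__main__":
    # Example forecast: the model predicts a high reading 20 minutes from now.
    print(alert_for_prediction(predicted_pressure=120.0, minutes_ahead=20))
```

The point of the sketch is the return value: the model’s output is only useful once it maps to something the operator can actually do.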

Wrapping Up 

When you isolate your models, they are far less likely to have their intended impact. To design data science projects that have an actionable outcome, you need to avoid putting your data science team in a silo and instead encourage cross-team collaboration from the moment models are conceptualized. 

To do so, you need to encourage true team transparency and build a cross-functional understanding, not only of individual models, but of data science concepts as a whole. 

If you want another set of eyes to help you determine your highest-value data science use cases, our team is here to help. Request a free AI Assessment.
