As subscription services have risen to dominance in recent years, companies have become increasingly aware of the need to figure out which customers are going to churn. Machine learning, with its ability to process huge amounts of data and extract insights that aren’t obvious to human brains, has become a go-to tool for predicting customer churn.
Problem solved! The future is now, and artificial intelligence has solved all of our problems!
Wait, what do you mean it’s not that simple!?
Customer churn is actually trickier to identify and prevent than it might appear at first glance. In this post, I’ll break down the four most common challenges that come up for organizations when they start using machine learning for churn, as well as how to address them, to make sure you’re generating real business value with your churn prediction efforts.
Why Value is Critical for Machine Learning
Here at RapidMiner, we talk a lot about the value of machine learning and analytics. None of the work of data science is of benefit to a business unless it can demonstrate real value and an impact on the organization’s bottom line.
To that end, when we think of using machine learning to solve business problems, we always want to tie it to the value that it’s generating. In fact, we even wrote a whitepaper—Talking Value—about how to calculate the value of a machine learning model in the space of customer analytics. If you read through this guide, you can quickly see why churn is such an interesting use case: it scales with the revenue from your customers and thus with the total revenue of the company.
If churn is so interesting, and it’s such a common use case, why didn’t we use it as the example in Talking Value? Because churn has some extra complications that make it difficult to simplify enough to be a general example—and that’s why we’re writing this blog post. Think of this post as an addendum to Talking Value; reading the whitepaper alongside this blog post will help you understand how to use the thinking in the whitepaper to address customer churn, without falling victim to the most common challenges.
Four Big Churn Modeling Challenges
Here are the four biggest challenges that businesses encounter when they start thinking about predicting customer churn, as well as how to address them.
Challenge 1: Sleepers
The first common problem in churn detection and prevention is that many businesses hold very lucrative contracts with people who have been using the service for a long time: they signed up and never looked at other options. These people are often called sleepers, and if you start to contact them, you can end up waking a sleeping lion. Reaching out might prompt these sleepers to move to a competitor, or simply to cancel their membership.
From a machine learning point of view, we think about this as the cost of a false positive: what cost is associated with labeling a loyal customer as someone who is likely to churn? How many of these sleepers will leave if you reach out?
In many businesses, the number of loyal customers is much higher than the number of churners, so the total cost of false positives might be high as well. You run the risk of waking up a lot of “sleeping” loyal customers who weren’t going to cancel their subscription, but now will, because you reminded them about it.
The solution: Do a value assessment. You can use the materials in Talking Value as a guide here, but essentially, you want to try and figure out how many of these false positives will leave if you wake them up with a discounted offer to stay. That way, you can factor that lost revenue into your value calculations to determine if it’s worth waking these customers up.
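To make that assessment concrete, here is a minimal sketch of the arithmetic. It is not the method from Talking Value; every name and number below (revenue, discount, save rate, wake rate) is a hypothetical assumption that you would replace with your own estimates.

```python
from dataclasses import dataclass


@dataclass
class CampaignAssumptions:
    """Illustrative inputs only; replace every value with your own estimates."""
    monthly_revenue: float   # revenue per customer per month
    horizon_months: int      # months of future revenue counted per customer
    discount_rate: float     # fraction of revenue given up by the retention offer
    save_rate: float         # share of contacted true churners who stay
    wake_rate: float         # share of contacted loyal customers (sleepers) who leave


def saved_revenue(true_positives: int, a: CampaignAssumptions) -> float:
    """Discounted revenue kept from real churners who accept the offer."""
    customer_value = a.monthly_revenue * a.horizon_months
    return true_positives * a.save_rate * customer_value * (1 - a.discount_rate)


def campaign_value(true_positives: int, false_positives: int,
                   a: CampaignAssumptions) -> float:
    """Net revenue change from contacting everyone the model flags."""
    customer_value = a.monthly_revenue * a.horizon_months
    # Revenue lost from loyal customers who were flagged by mistake and then left.
    woken = false_positives * a.wake_rate * customer_value
    return saved_revenue(true_positives, a) - woken


def break_even_wake_rate(true_positives: int, false_positives: int,
                         a: CampaignAssumptions) -> float:
    """Wake rate at which the campaign neither gains nor loses revenue."""
    if false_positives == 0:
        return float("inf")  # no sleepers contacted, so nothing can be woken
    customer_value = a.monthly_revenue * a.horizon_months
    return saved_revenue(true_positives, a) / (false_positives * customer_value)


# Hypothetical scenario: the model flags 500 customers, 100 of them real churners.
assumptions = CampaignAssumptions(
    monthly_revenue=50.0, horizon_months=12,
    discount_rate=0.2, save_rate=0.3, wake_rate=0.05,
)
print(campaign_value(100, 400, assumptions))        # about 2,400: slightly positive
print(break_even_wake_rate(100, 400, assumptions))  # about 0.06
```

In this made-up scenario, the campaign stops paying off once roughly 6% of the contacted loyal customers leave, which shows how thin the margin can be when false positives outnumber real churners.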
If that seems like it could be hard to figure out, you’re not wrong. And in fact, that leads us to our next churn challenge.
Challenge 2: Churn may not be predictable from your data
Churn is often a multifaceted problem. There isn’t a single reason why everyone who churns decides to leave; rather, people have many different motivations for canceling their subscription. These range from moving to a competitor’s product to cutting back on bills. Depending on the data you have, it might not be possible to detect a pattern for a particular churn motivation.
Let’s take my cell phone contract as an example. When I was a student, I opted for a low-cost, pre-paid plan. The network quality wasn’t very good, and it didn’t have roaming included, but it was cheap, which is what I needed at the time.
When I moved into my business life, my needs increased, as did my ability to pay for a more expensive plan. As a European who frequently travels to the United States, I needed roaming and I could easily afford better network quality.
So the reason that I churned from the plan I purchased as a student was that my life had changed. How could my mobile provider possibly know this from its data? Quite possibly, it couldn’t. If there is no pattern in your data, you simply cannot predict the outcome.
The solution: Talk to your domain experts before you get started. You won’t be the first one to think about the problem of churn. As usual, you want to include the domain experts, or even enable them to play around with possible ML solutions themselves with tools like RapidMiner Go, in order to figure out what you can—and potentially can’t—predict from the data you have.
Challenge 3: Deploying often requires human trust
Frequently, people who work with data forget the human component of businesses. Before starting a churn detection project, you should ask yourself questions about how your churn score might be used. Is it triggering automated emails or text messages? Will the information be used to inform a local agent to take action?
Asking other people to take action based on your model requires something outside of your data and outside of your ability to control: human trust. Your model will never be successful if its scores are not taken seriously or if people nitpick individual data points rather than trusting the model.
The solution: You have two options here. The first is to enable your end users. RapidMiner Go is an awesome platform for this. You can use it to teach the end users what models can and can’t do and why you trust them. You effectively transform them into data citizens who then trust you not because they have to, but because they want to. In fact, you could even teach them enough that they could start building some models of their own for their work.
The other option that is helpful here is explainable AI. You can use algorithms like LIME or SHAP to give not just a prediction, but also a reason why the model made the decision that it did. This increases the trust that people have in a model’s predictions. I would be a bit careful here, since this can also backfire if the reasoning given by those algorithms is counterintuitive to what people expect. Nonetheless, it can be a good step in getting buy-in for your models.
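To give a feel for what such a reason can look like, here is a minimal sketch that is deliberately not LIME or SHAP. It hand-computes per-feature contributions for a small linear churn model relative to an average customer, which is the simplest case of the additive explanations those tools produce. The feature names, weights, and customers are all invented for illustration.

```python
import math

# Hypothetical logistic-regression churn model. The weights, bias, and the
# "average customer" are assumptions made up for this example.
WEIGHTS = {"support_tickets": 0.8, "months_as_customer": -0.05, "monthly_price": 0.02}
BIAS = -1.0
AVERAGE_CUSTOMER = {"support_tickets": 1.0, "months_as_customer": 24.0, "monthly_price": 40.0}


def score(customer):
    """Raw model score (log-odds of churn) for one customer."""
    return BIAS + sum(WEIGHTS[f] * customer[f] for f in WEIGHTS)


def churn_probability(customer):
    """Logistic transform of the raw score into a probability."""
    return 1.0 / (1.0 + math.exp(-score(customer)))


def explain(customer):
    """Per-feature contribution to the score, relative to the average customer.

    Returns (feature, contribution) pairs, largest absolute effect first.
    Positive values push the prediction toward churn.
    """
    contributions = {
        f: WEIGHTS[f] * (customer[f] - AVERAGE_CUSTOMER[f]) for f in WEIGHTS
    }
    return sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)


customer = {"support_tickets": 4.0, "months_as_customer": 6.0, "monthly_price": 60.0}
print(f"Churn probability: {churn_probability(customer):.2f}")
for feature, contribution in explain(customer):
    direction = "raises" if contribution > 0 else "lowers"
    print(f"{feature} {direction} the churn score by {abs(contribution):.2f}")
```

A local agent who sees "support_tickets" at the top of the list has something concrete to act on, and also something concrete to disagree with, which is exactly the backfire risk mentioned above.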
Challenge 4: Predicting churn is NOT the problem
This is perhaps the most important point, which is why I left it for last. You don’t actually want to predict churn. What you want to do is prevent churn in such a way that you come out with more revenue than you would have had if you’d done nothing.
Let’s say that, when I quit my mobile contract, my provider had a 100% accurate indicator that I was about to churn. They could have called me and emailed me all the discounts they wanted. The problem was, I wasn’t interested in discounts! I needed a good offer for a business plan rather than a cheap, discounted plan. In this scenario, it would be clear that the provider didn’t have a good ‘next best action’ model in place to take appropriate action to try and save a customer.
What they should have done instead was offer me other plans that would work better for my needs, and maybe include a small discount for being a current customer. That would have provided a lot more incentive for me to stay and it also would have increased the amount of revenue that the company was getting from me each month, rather than losing it entirely.
The solution: The best churn models don’t just predict who will churn but also prescribe an action to try and prevent it, because only prevented churn generates value.
There are two options here. First, you could build separate models to predict different churn reasons, such as a “Price Too High” model and a “Bad Service” model. You can then apply business rules to each model’s output to make targeted offers. A second approach is to use two models: one to predict churn, and another to predict or prescribe the next best action, i.e., the action that reduces the churn likelihood the most. This second model would be a conversion model.
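The two-model approach can be sketched roughly as follows. Both "models" here are hypothetical stand-ins written as plain functions, with invented probabilities that loosely mirror the mobile-plan story; in practice each would be a trained model.

```python
from typing import Callable, Dict, List, Optional

Customer = Dict[str, float]


def next_best_action(
    customer: Customer,
    churn_model: Callable[[Customer], float],
    action_model: Callable[[Customer, str], float],
    actions: List[str],
    threshold: float = 0.5,
) -> Optional[str]:
    """Pick the action that lowers the predicted churn probability the most.

    Returns None when the customer is below the churn threshold (to avoid
    waking sleepers) or when no action is predicted to help.
    """
    baseline = churn_model(customer)
    if baseline < threshold:
        return None
    best_action, best_reduction = None, 0.0
    for action in actions:
        reduction = baseline - action_model(customer, action)
        if reduction > best_reduction:
            best_action, best_reduction = action, reduction
    return best_action


# Stand-in for model 1: churn probability without any intervention (assumed).
def toy_churn_model(customer: Customer) -> float:
    return 0.8 if customer["roaming_trips_per_year"] > 3 else 0.1


# Stand-in for model 2: churn probability after a given action (assumed).
def toy_action_model(customer: Customer, action: str) -> float:
    after_action = {"discount": 0.75, "business_plan": 0.3}
    return after_action[action]


frequent_traveler = {"roaming_trips_per_year": 6.0}
homebody = {"roaming_trips_per_year": 0.0}
actions = ["discount", "business_plan"]

print(next_best_action(frequent_traveler, toy_churn_model, toy_action_model, actions))
# prints: business_plan
print(next_best_action(homebody, toy_churn_model, toy_action_model, actions))
# prints: None
```

Choosing by the largest drop in churn probability follows the definition above. A natural extension would be to weight each action by its cost and the revenue it preserves, which ties back to the value calculation.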
So, Churn’s a Bad Use Case?
No! As we said at the outset, with more and more businesses moving to subscription-based revenue models, being able to predict and then prevent churn is potentially one of the most impactful machine learning use cases out there.
But in order to have that impact, you need to make sure you’re doing churn prediction and prevention right. That means:
- Weighing the costs and benefits of waking any sleepers
- Making sure you can predict churn from your data
- Getting buy-in from other humans involved in the process
- Offering potential churners the right incentives to stay
Add to these points the critical message of Talking Value—that you need to understand the revenue implications for any action you’re going to take, including things like false positives—and you’re ready to use machine learning to build a churn model that doesn’t just predict when someone is about to churn, but offers you an appropriate action to take to help keep them as a customer.