Building models to predict customer actions is just the first step in implementing data mining and predictive analytics solutions. The integration of the models with your business processes is a critical step to a successful analytics program.
In this webinar we’ll discuss leveraging models for:
- Agent Churn: Predict which call center agents are most likely to quit, and identify the primary drivers of attrition, to reduce turnover and decrease hiring expenses
- Customer Churn: Predict which customers are most likely to stop using your services and integrate with your CRM to enhance call center agent knowledge
- Cross Sell Opportunities: Predict which customers are most likely to accept a new offer or service and integrate with your CRM to improve revenue by targeted selling
Hello, everyone! And thank you for joining us for today’s webinar, Enhancing Call Center Operations with RapidMiner. I’m Hayley Matusow with RapidMiner and I’ll be your moderator for today’s session. We’re joined today by Derek Wilson. Derek Wilson is president and CEO of CDO Advisors LLC, a data mining, predictive analytics, and data management company. In his current role, he is responsible for the overall architecture, strategy, and delivery of business intelligence, analytics, and predictive solutions. He’s focused on transforming how companies leverage data to gain new insights about their customers and operations to drive revenue and decrease expenses. Derek will get started in just a few minutes but, first, a few housekeeping items for those on the line. Today’s webinar is being recorded and you’ll receive a link to the on-demand version via email within one to two business days. You’re free to share that link with colleagues who are not able to attend today’s live session. Second, if you have any trouble with audio or video today, your best bet is to try logging out and logging back in, which should resolve the issue in most cases. Finally, we’ll have a question-and-answer session at the end of today’s presentation. Please feel free to ask questions at any time via the questions panel on the right-hand side of your screen. We’ll leave time at the end to get to everyone’s questions. I’ll now go ahead and pass it over to Derek.
All right. Thanks, Hayley! Thanks, everybody, for attending. The real focus of this session today is to go through not only the deep technical pieces of RapidMiner that you might have seen on other RapidMiner webinars they’ve done, but to really walk through what are business cases and how do you actually use RapidMiner Studio and RapidMiner Server together to operationalize those and add value to your business. So as Hayley said, I’m president and CEO of CDO Advisors. I’ve been using RapidMiner for over three years now and have found lots of great opportunities to use the tool to really solve business problems.
So let’s get started. So on our agenda, we’re going to do a quick overview of the RapidMiner platform, get into the different RapidMiner software that’s out there, and then look at three different use cases. One is Agent Churn which is call center agents. Those that are your staff at your call centers. How can you predict which agents would churn? Also, look at customer churn. So from a call center perspective, what can you do to look at when customers that you have potentially are leaving and what can you do about that? And the last thing is cross-selling opportunities. So how do you take all the data from your various systems, build it into a model, and feed that information back to your call center team so that when a customer calls in, they already know what to offer a customer as the next best offer or next best action. And then the last piece will be going through implementing models into production operations.
All right. So to start with, this particular session is really going to focus on RapidMiner Studio and RapidMiner Server. So RapidMiner Studio is the platform where you go in and you build the models. Typically, that’s where companies start: they have users, and many of you have probably downloaded at least the free version to go out and build these models. And then, as you advance, it gets more complex. RapidMiner Server is the next place to start using the software to deploy your models out to the Server so that you can save those models. You can do the computation out there. If I wanted to build a model, deploy it to the Server, and then have another RapidMiner Studio user open that model and modify it, there’s a repository feature out there where they can just go out and pick up the model. You can schedule it. You can integrate this with other things. And we’ll touch on bits and pieces of all this as we get through the demonstrations.
All right. So, first, we’re going to do a quick survey just to kind of set the tone and see who all is using what. So we just want to figure out, hey, how many of you are actually using RapidMiner Studio? So I’m going to launch a quick survey here, but this has five questions, so if you can fill those out. We’re almost up to 100%. All right, three more seconds. Okay. So most of you on the webinar looked to be using the free version and the next largest chunk of people in the webinar actually are not using RapidMiner Studio at all. So this would be a good intro for those of you that haven’t used it as well as those that are using the free version to really see how to apply this to your business cases.
All right. Okay. So the first thing we’re going to do, the model on the left, the diagram, that’s the CRISP-DM model to go through data mining. If you haven’t seen that, you can google it; it’s out there. It’s a good thing to walk through and at least understand. And data modeling and data mining involve a lot of iterative processes. So it’s not a linear path like, “I’m going to build a calculator application, and at the end of it, 9 times 9 is going to equal 81.” A lot of times, you’re going to pull data, build the model, test the model, and change the inputs that you’re giving your model to use. So in this case, I just want to quickly go through– for Agent Churn, the business case is, can we predict which agents are at the highest risk of churn? So that’s the business question that we’re trying to answer. And then the next logical thing is, do we have data that can be useful to understand patterns and why the agents have churned? And then, can we access and process the data and is the quality of the data sufficient? So a lot of times, you might have data but the data quality is poor, or you don’t have access to the data that you really want and it takes longer to get that.
So those are all key features in data preparation, data understanding. And then, what kind of models can we build? There are various algorithms out there that you can apply to your data. The accuracy of the models is important, but the real goal of this is how do you make the model outcome actionable, and that’s really what we want to focus on. So in this case, for the Agent Churn model overview here, right, you can see the inputs on the left, right, agent performance, call center/HR data. That’s where you would actually go grab the information on your agents. The output is a prediction of attrition for each agent and the confidence of that prediction. So that’s the data modeling piece. But the business outcome side is really how do you retain the best agents, which attributes lead to high attrition, what can you do to keep more high-performing agents, can training be updated to target weak areas of performance that lead to attrition? So these are all really important questions that you can get some of the information and answers for as you go through and build out your models.
So in the next slide– I kind of break RapidMiner up into kind of two different–
RapidMiner Studio where I’m going to build a model, or multiple models, and figure out exactly what I want to do with that. The operational piece of it is, now that I’ve got a model, how do I output that model to my call center management? So I’m going to break right here and go into a quick demo. So this is my Agent Churn model. Basically, you can see I’ve got an Excel file in this case where I’ve pulled sample data together. So basically what I’ve done is I have a flag here that says did the agent attrit, did they leave the company, yes or no. And then I have other characteristics – it’s easier to look at it like this, right? – the age, do they have to travel, the department, distance from home, education, like all these different bits of information that you can find in your systems about your past employees. And what you’re really doing in the model is applying– here’s my historical information, I’m going to build the model and apply what the model knows back to my active agents to really understand which of those agents are most likely to attrit.
So in this case, I go through, I run my model. And you can see here, right, I’ve predicted which agents are most likely to attrit. And it actually goes through. In this case, this is a decision tree model that goes through and says, “Hey, this person, they did have overtime; they were over 26 and a half years old; their environmental satisfaction was medium, right; they were under 41; and their stock option flag was no.” So that particular path on this tree led to more agents leaving than agents staying. So just knowing that, I can take that information, work with my call center management team, and say, “Hey, here’s an interesting bit of information that you may not have known about the historical reasons why employees have left.” So that’s a really simple way to do it.
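To make the idea of a decision-tree path concrete, here is a minimal Python sketch that expresses the path read out in the demo as a plain rule and applies it to a few active agents. The field names, the sample records, and the choice of strict inequalities are assumptions for illustration; only the conditions themselves come from the demo.

```python
# Hypothetical field names; the thresholds mirror the path described in the demo.
def on_high_attrition_path(agent: dict) -> bool:
    """Return True if the agent falls on the tree path where leavers outnumbered stayers."""
    return (
        agent["overtime"] == "Yes"
        and 26.5 < agent["age"] < 41          # strictness of the bounds is an assumption
        and agent["environment_satisfaction"] == "Medium"
        and agent["stock_option"] == "No"
    )

# Made-up active agents, used only to exercise the rule.
active_agents = [
    {"id": 1, "overtime": "Yes", "age": 33, "environment_satisfaction": "Medium", "stock_option": "No"},
    {"id": 2, "overtime": "No", "age": 33, "environment_satisfaction": "Medium", "stock_option": "No"},
    {"id": 3, "overtime": "Yes", "age": 45, "environment_satisfaction": "Medium", "stock_option": "No"},
]

flagged = [agent["id"] for agent in active_agents if on_high_attrition_path(agent)]
print(flagged)  # [1]
```

Writing the path out this way is one reason a decision tree is easy to explain to a management team: each branch reads as an ordinary business rule.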
I know if you’re a data scientist on the call, you might be like, “Well, a decision tree is not really the best.” And that’s true. RapidMiner has lots and lots of different types of algorithms. The reason I like to use decision trees, especially when you’re getting started with data mining inside of your company, is that most of your management team, they’re not quants. They’re not data scientists. They have management degrees. They went to business school. They understand that aspect of it. So if you can get people to understand how I take data, how I mine it, how a decision tree algorithm works, where they can see it visually represented, you can get them bought in on, “Hey, let me start making business decisions on that data.” Then, later on, you can go out and say, “Well, now, let me try a different type of algorithm.” Let me try random forest. Let me try this, this, and this to get a better– you might have a better, more accurate model. But at the end of the day, if your business users don’t understand how it works, they’re not going to do anything actionable with it to improve the business. So if you haven’t started anything or you’re looking to get started, go in small steps, do the basic workflow, build something out in Studio, get data to your management team, and work directly with them. You can’t just hand it off to them; really walk through the process of how the model works, at least at a high level, and do some A/B testing on the information. That’s the easiest way to get started there.
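The A/B testing mentioned here can start very simply. The sketch below assumes a hypothetical list of flagged agent IDs, splits it at random into a control group and an experiment group, and later compares attrition rates. The IDs, the seed, and the outcome set are invented for illustration and are not from the demo.

```python
import random

def split_control_experiment(ids, seed=42):
    """Randomly split IDs into two equal-sized groups (fixed seed for repeatability)."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

def attrition_rate(group, left_company):
    """Share of the group that ended up leaving."""
    return sum(1 for agent_id in group if agent_id in left_company) / len(group)

flagged_agents = range(1, 21)           # hypothetical agents flagged as high risk
control, experiment = split_control_experiment(flagged_agents)

# Hypothetical outcome recorded some months later: IDs of agents who left.
left_company = {1, 4, 7, 9, 15}

control_rate = attrition_rate(control, left_company)
experiment_rate = attrition_rate(experiment, left_company)
print(f"control={control_rate:.0%} experiment={experiment_rate:.0%}")
```

Only the experiment group would receive the retention action. With groups this small the difference is not statistically meaningful, so a real test would need larger groups and a significance check.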
So next, one more survey. And this is just to understand who all is using RapidMiner Server so we know where this goes in the rest of the presentation here. So I just launched that survey. We’ll give this one another few seconds. Most people have voted. All right. Time is up. All right. So most of the people on this webinar have not used RapidMiner Server. So this will be again a good introduction of how RapidMiner Server would work in this workflow.
So the next piece here is what I call the Agent Churn workflow, the advanced version. The only real difference here is now I’ve inserted RapidMiner Server into the mix. So I still have my CRM, my call center data, my HR data. I would still do everything I just did in RapidMiner Studio. Do the demo I showed you. I’ve got the model. I’ve got output. But, now, maybe there’s information in that model that I don’t want to update manually once a month, once a week. So let’s say, there’s a key field in your model such as tardiness and tardiness, of course, can occur daily, multiple times per day potentially. And if that’s a key indicator, you might need to run your model daily, every 6 hours, or every 12 hours, depending on your call center operations, to go, “Hey, this person is continually being late.” And that’s a leading indicator. So we can’t have a model run once a month and operationalize that. So in those types of situations, that’s where RapidMiner Server can come in. So in this case, you take your RapidMiner Studio model. You deploy it out to the RapidMiner Server. And then once it’s on the Server, there are multiple things I can do with that. I can schedule it. I can have the connections in RapidMiner Server. Instead of reading from an Excel file, I can have those run a query, go directly against the database, and then run the results. All of this would be automated on the Server. I can have the output saved to a database or to a file. You have all the same options as you do in Studio. The difference is I don’t have to sit there and manually run this over and over and over.
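The automated loop described here (query the database, score, save the results) can be pictured with a small Python sketch. This is not RapidMiner Server's API; it is a stand-in that uses an in-memory SQLite database, hypothetical table and column names, and a placeholder scoring function in place of the deployed model.

```python
import sqlite3
from datetime import datetime, timezone

def score_agent(tardy_last_30_days: int) -> float:
    """Placeholder for the deployed model: a hypothetical risk score in [0, 1]."""
    return min(1.0, tardy_last_30_days / 10)

def run_scoring_job(conn: sqlite3.Connection) -> int:
    """Read current agent data, score it, and store timestamped results."""
    rows = conn.execute(
        "SELECT agent_id, tardy_last_30_days FROM agent_activity"
    ).fetchall()
    run_at = datetime.now(timezone.utc).isoformat()
    conn.executemany(
        "INSERT INTO churn_scores (agent_id, risk, run_at) VALUES (?, ?, ?)",
        [(agent_id, score_agent(tardy), run_at) for agent_id, tardy in rows],
    )
    conn.commit()
    return len(rows)

# Hypothetical schema and sample rows standing in for the operational database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE agent_activity (agent_id INTEGER, tardy_last_30_days INTEGER)")
conn.execute("CREATE TABLE churn_scores (agent_id INTEGER, risk REAL, run_at TEXT)")
conn.executemany("INSERT INTO agent_activity VALUES (?, ?)", [(1, 0), (2, 7), (3, 12)])

scored = run_scoring_job(conn)  # a scheduler would call this every 6, 12, or 24 hours
high_risk = conn.execute(
    "SELECT agent_id FROM churn_scores WHERE risk >= 0.5 ORDER BY agent_id"
).fetchall()
print(scored, high_risk)  # 3 [(2,), (3,)]
```

Because each run writes a timestamp, a reporting tool pointed at the scores table can show how an agent's risk moves from one run to the next.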
And then the next piece that helps with this is as you automate it, you can drop the information out. And like, in this case, if you drop it out to a database, instead of your management team having to access it via Excel or some type of raw CSV or something like that, you could actually build reports on top of it. So you could give them any number of reporting tools that they can go out and just monitor this information. So that’s a generic workflow for how Agent Churn predictions would work.
All right. So the next piece is Customer Churn. Another use case similar to Agent. Same process, right, but the business question is, can we predict which customers are most likely to churn? We have to build and evaluate the models. What are the key attributes of a customer who responded in the past? Can we apply the model, right, operationalize this? And then how do we make the model outcome actionable? Again, the whole point of this is RapidMiner is a great tool if you can operationalize the data, and that’s where you really get the big benefit for your business and ahead of your competition. So for the Customer Churn model, in this case, our inputs would be customer CRM data, operations data. Anything that you have on your customer that you want to feed into your model, that would be your input. The output would be the prediction of attrition for each customer along with the confidence of prediction for that customer. But, again, the focus of this session is on business outcomes, right? What you really want to know is how to retain customers, what are the attributes that lead to high attrition, what operational processes can be changed to reduce attrition, and is there a different product or service that’s a better fit for your customers.
Those might be things that you find out of doing these type of operations that, “Hey, customers that are on product A attrit really highly past the 9th month, but the customers on product B, they stay 24 months.” Was there something different in the operations process between product A and product B? Is it billing cycle? Is it some small fee that maybe you go, “Oh, you know, $5 fee that doesn’t really matter”? But if you’re charging $5 times 9 months and the other product isn’t charging that monthly fee and customers are staying 24 months, maybe you make more margin by not charging the fee. So those are all examples of how that works. So again, from the basic standpoint, right, CRM, call center data, operations data, really anything that you want. And in this case, I’ll show you a quick demo on how to predict churn from a telecom company. And then the same thing, output the files, the data on the files, get that to your call center management team. You don’t need to build extravagant workflows. You’re really just trying to get this started. Figure out what kind of A/B testing is possible and what you can do with the data that you have to say, “Hey, here’s my control group; here’s my experiment group. Let’s see what happens with the experiment group and track those over time.” Data mining and predictive analytics introducing it to a new company is all about building trust and getting your management team to understand how it works and showing the value that it can bring to your organization.
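The fee example can be worked through with a few lines of arithmetic. The sketch below takes the $5 fee, the 9 months, and the 24 months from the talk; the $20 base monthly margin is a hypothetical figure, and the comparison assumes the fee really is what drives the difference in tenure, which the talk only raises as a possibility.

```python
FEE = 5.0        # monthly fee charged on product A (from the example)
MONTHS_A = 9     # product A customers leave after about 9 months
MONTHS_B = 24    # product B customers stay about 24 months

def lifetime_margin(base_margin: float, months: int, fee: float = 0.0) -> float:
    """Total margin earned from one customer over their tenure."""
    return (base_margin + fee) * months

base = 20.0  # hypothetical monthly margin before any fee
margin_a = lifetime_margin(base, MONTHS_A, FEE)   # (20 + 5) * 9
margin_b = lifetime_margin(base, MONTHS_B)        # 20 * 24

# Product B wins whenever base * 24 > (base + 5) * 9, i.e. base > 45 / 15.
break_even = FEE * MONTHS_A / (MONTHS_B - MONTHS_A)

print(margin_a, margin_b, break_even)  # 225.0 480.0 3.0
```

Under these assumptions, dropping the fee pays off for any product whose monthly margin is above $3, because the extra 15 months of tenure outweigh the $45 of fee revenue.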
So I’m going to go back into my example. So again, I’m not going to go through how all of this works. But here I’ve got a customer status. You’re either active or you’re churned. And in this case, again, you can see I’ve got a contract type, one year, month to month, different things, if the customer has dependents, if they bought device protection, do they have internet service, multiple lines, online backup. So again, just different characteristics of the products that this particular fictional company has. The customers, do they stream movies, do they stream TV, did they call tech support, gender, spouse, all those type of great information to have. And after going through all of this, I can get to my examples here. And you can see that– let me see churned. Confidence, I actually want– the confidence is inactive, right? So you can see here, I’ve identified customer 46 currently active, high risk of churn at 0.73. So again, this is a decision tree that I started with. I could use this to walk– I can easily walk my management team through customers that are on a month-to-month contract, that have internet service, that’s our 20 Meg product, most of them are churning rather than staying. We have 592 active. We had 1,621 actually churned. Maybe the 20 Meg product is not a good product. Maybe you should be lifting those customers from that product to a different product. So again, this is why the decision tree is really from a visualization and buy-in perspective kind of the go-to algorithm that I try to use. And again, there might be better scientific ones that you can apply, but you have to save those for after your management team trusts and understands how this whole process is working.
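A decision tree typically reports the confidence at a leaf as the share of training examples in that leaf that belong to the predicted class. As a quick check, the counts quoted above give roughly 0.73, which lines up with the churn confidence shown for customer 46, although the talk does not say that this customer sits in that particular leaf. The customer IDs and scores in the ranking step are hypothetical.

```python
# Counts quoted for the month-to-month / internet / 20 Meg path.
active_in_leaf = 592
churned_in_leaf = 1621

churn_confidence = churned_in_leaf / (active_in_leaf + churned_in_leaf)
print(round(churn_confidence, 2))  # 0.73

# Hypothetical scored customers: rank the active ones by churn confidence.
scored_customers = [
    {"id": 46, "status": "Active", "churn_confidence": 0.73},
    {"id": 47, "status": "Active", "churn_confidence": 0.12},
    {"id": 48, "status": "Churned", "churn_confidence": 0.73},
    {"id": 49, "status": "Active", "churn_confidence": 0.55},
]
at_risk = sorted(
    (c for c in scored_customers if c["status"] == "Active"),
    key=lambda c: c["churn_confidence"],
    reverse=True,
)
print([c["id"] for c in at_risk])  # [46, 49, 47]
```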
Okay. So let me go back to the slides here. All right. So again, from the advanced workflow, not a lot different than the Agent Churn that we saw. You’ll still build your model, the model I just demoed. You have a model. In this case, same thing, I would deploy it to the RapidMiner Server. Once it’s on the Server, I have the same options. I can schedule it to update. I can save the results to a database. I can share it in the repository. Like I said, in this case, again there could be pieces of data where you say, “Hey, every day I need to update the information. So I know for every customer, what’s the likelihood of that customer churning because data is coming in daily.” So in this case, maybe it’s inbound calls, so the number of times the customers called in, if they’re over their usage minutes, all those types of things. Maybe running the model once a week is not sufficient, and you need to run it daily. So in that case, same thing, RapidMiner Server. I can go out, schedule it, save the results, and provide reports to my management team. And that’s just part of building this process, getting these guys used to seeing it from an operations and management perspective, to go, “Hey, here’s another bit of information that you can look at to help make business decisions.”
And I can’t stress this enough. If you give any of these algorithms bad data, you’re going to get bad results. If you give it good data, you’ll get good results, but you still have to have the operational knowledge and what I call the operational lens to look at the information and make judgments on what you see, right? So it could be that the model is telling you one thing, and the operations team looks at it and they go, “Oh, yes, we realize that because we launched that product a year ago. It was a poorly performing product, so don’t even put that one in the model. We already know that that model– or that product was bad. Eliminate that one from the model. Rerun the model, and let’s see what happens because we already have a campaign underway to switch customers from the poor performing product to a new product.” So that’s another example.
All right. And then the last example here, and this one I’ll actually show you a little bit more on the Server as well, is cross-selling, right? It’s a big thing everybody wants to do right now. I’ve got my customers and I have products and I want to sell more products to more of my customers. So in this case, the business question is, can we predict which customers are most likely to respond to a marketing campaign, right? So we build and evaluate the models. Do we have information on the past about which customers responded the best? How do we apply the model to an active campaign list? Can we operationalize this data? And again, the whole point of all of this is, how do you make the model outcome actionable?
So our inputs on this case would be customer CRM data, operations data. And the output is, right, we want to predict which customers are most likely to accept a new product along with the confidence of that prediction. So from a business outcome perspective, right, who to target for products, we want to limit the budget to the customers that are most likely to accept an offer, right? So if you’re not doing this today, your marketing teams are most likely going out and saying, “Well, we want to target to customers that are over a certain income level. They have kids. They don’t have kids. They live in this zip code.” They’re really going through using the best information that they have which is what they’ve historically always done, but the data may tell the marketing team something completely different that says, “Hey, instead of marketing to that group, if you really market to this group of customers, you don’t have to spend as much and you’ll get more customers on the products.” That’s really what you’re trying to get out of these cross-selling models.
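Limiting the budget to the customers most likely to accept can be as simple as ranking by model confidence and cutting the list where the money runs out. The sketch below assumes hypothetical customer IDs, confidence scores, a cost per contact, and a budget; none of these figures come from the talk.

```python
def pick_targets(scores: dict, cost_per_contact: float, budget: float) -> list:
    """Return the highest-confidence customers that fit within the budget."""
    max_contacts = int(budget // cost_per_contact)
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [customer for customer, _ in ranked[:max_contacts]]

# Hypothetical acceptance confidences produced by a cross-sell model.
scores = {"C1": 0.82, "C2": 0.15, "C3": 0.64, "C4": 0.91, "C5": 0.40}

targets = pick_targets(scores, cost_per_contact=2.0, budget=6.0)
print(targets)  # ['C4', 'C1', 'C3']
```

A demographic rule of thumb would likely pick a different list; comparing the two lists in a controlled campaign is one way to show the marketing team what the model adds.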
And then the last piece, right, is to really give your call center agents the next best offer to increase sales. So if a customer calls in and they’re already talking to your call center agent and it’s a positive call, they’re not calling in about an issue, then wouldn’t it be great to already have, “Hey, I see that you have product A and B, we have a special on product C”? Or if another customer calls, “I see you have product A, C, and D. Oh, would you like to have a product H?” And all of that information can be derived through the data mining cross-selling model. So there’s endless possibilities for this.
So again, that’s kind of the basic workflow here, RapidMiner Studio, right? I’m getting my CRM, my call center data, my operations data, and I’m going to pull it in. I’m going to build a model. That’s financial cross-sell here. And then I’m going to output my files to my marketing and sales team. And again, the iterative process of this, I’m going to give it to them, they’re going to look at it, and they might do the same thing. Well, we don’t want to market to this particular group because that’s not the customers that we want to acquire, or we want to run a special on this product, not that product. So you really have to engage, and it’s one of those things. Your operations teams and your management teams, they’re not going to know what they don’t know. So a lot of times, I’ve found it helps to build a model. Even if it’s not the perfect model, build something, start having these conversations with your operations teams, and then pose the question that I always use which is, okay, now that you kind of see what the process is – let’s say I gave you a perfect model today that 100% accurately could predict, which we can’t, but hypothetically if we could – what would you do with that information today?
And that really makes them think through their operational processes to go, “Okay. If you can tell me that with 100% accuracy, what do I need to do– or where would I inject this information?” And again, that’s where RapidMiner Server can come in because same process. We’ve got all this information coming in. I’m pulling it in, publishing it to the Server, but they might say, “Hey, I really want this information to be pushed to this location, and then I’m going to have a team look at that.” And then they’re going to feed information back to another team to really figure out and encapsulate and make all of this stuff work the way that it’s supposed to work to build out your operational processes. So again, RapidMiner Server is where you can do a lot of that, the automations. You could publish the Studio model out there, run it, have it save data to different locations, etc. And again, marketing and sales, they can pick up the information and use it in a variety of ways.
So last one here. Before I get into that, let me jump back into this. So for the cross-selling, this is actually pretty interesting. So what happens is on your data set– let me just add a quick break here. So in this case, I’ve got all my accounts for all of my customers, and most of my fields are binary. But in this case, I have account origin ID. There’s a 1 to 4– 0 to 4. Sorry. I have a checking account indicator 0 or 1, savings indicator 0 or 1, credit card, so on and so forth. I have an integer column that tells me the number of inbound calls, how long the customer has been a customer in months, if they’re in my vacation club, holiday club. So again, apply it to your industry. You can build out this entire dataset. And at the end of it, what it’s doing is going through and building association rules. So in this case, my association rule says– now what I’m really looking for in this one is called lift. Let me pull that over. So this model says if I have auto loan and checking, my conclusion is I should have a home loan. So we’ll just stick on this example here.
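For readers new to association rules, the three numbers attached to a rule (support, confidence, and lift) can be computed by hand. The sketch below uses a made-up set of eight customers and the rule from the demo, {auto loan, checking} → {home loan}. It illustrates the standard definitions only; it is not the RapidMiner operator itself.

```python
def rule_metrics(baskets, premise, conclusion):
    """Compute support, confidence, and lift for the rule premise -> conclusion."""
    n = len(baskets)
    with_premise = [b for b in baskets if premise <= b]
    with_both = [b for b in with_premise if conclusion <= b]
    with_conclusion = [b for b in baskets if conclusion <= b]

    support = len(with_both) / n                      # how often the full rule occurs
    confidence = len(with_both) / len(with_premise)   # P(conclusion | premise)
    lift = confidence / (len(with_conclusion) / n)    # confidence relative to the base rate
    return support, confidence, lift

# Hypothetical customers, each represented by the set of products they hold.
customers = [
    {"auto_loan", "checking", "home_loan"},
    {"auto_loan", "checking", "home_loan"},
    {"auto_loan", "checking"},
    {"checking"},
    {"checking", "savings"},
    {"savings", "home_loan"},
    {"savings"},
    {"checking", "credit_card"},
]

support, confidence, lift = rule_metrics(
    customers, premise={"auto_loan", "checking"}, conclusion={"home_loan"}
)
print(f"support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
# support=0.25 confidence=0.67 lift=1.78
```

A lift above 1.0 means customers who match the premise hold the concluded product more often than customers in general, which is why lift is the column to look at when picking rules for a campaign.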
In this case, what can I do with that information? So if I’m a marketing team– or an operations team– sales team, I could then mine all of my active customers that have auto loans and checking that don’t have a home loan. And then that’s where you have to again look through your operational lens to go, “Okay, is there a subset of these customers that I now want to target?” Right? So maybe it’s not people that are over 75. We don’t want to target them with a new home loan. But maybe it’s customers that are between the ages of 25 and 45. We want to create a brand new product and go after those people for refinance, brand new home loan, whatever your product could be, because I know if they’ve got auto loan and checking, then most of the time my other customers have a home loan as well as– I come down here. You can see here, I have my debit card and my home loan indicator, and the conclusion is that I have an auto loan. So again, another different campaign, same model, but I can now go through and look at it and say, “Okay, let me figure out how many customers have a debit card and have a home loan but don’t have an auto loan.” And in this case, your parameters for who to target could be completely different because you might want to go after younger people. You might want to go after slightly older people. You might have different offers. If you have a great relationship with some insurance company, right, could you package something up and say, “Hey, we can work with you to go after our customers that we want to target for auto loans, but we want you to offer a great insurance rate”? Again, possibilities are really endless from what you can do, but this is how the association rules really work.
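Turning a rule into a campaign list is a filtering step: keep customers who match the premise, lack the concluded product, and pass whatever operational criteria the business sets. The sketch below uses the 25 to 45 age band from the example; the customer records and field names are hypothetical.

```python
def campaign_targets(customers, premise, conclusion, min_age, max_age):
    """IDs of customers who match the rule's premise but do not yet hold its conclusion."""
    return [
        c["id"]
        for c in customers
        if premise <= c["products"]               # holds every product in the premise
        and not (conclusion & c["products"])      # does not already hold the offer
        and min_age <= c["age"] <= max_age        # the operational lens: age band
    ]

# Hypothetical active customers.
active_customers = [
    {"id": 101, "age": 31, "products": {"auto_loan", "checking"}},
    {"id": 102, "age": 78, "products": {"auto_loan", "checking"}},
    {"id": 103, "age": 40, "products": {"auto_loan", "checking", "home_loan"}},
    {"id": 104, "age": 29, "products": {"checking"}},
]

home_loan_targets = campaign_targets(
    active_customers,
    premise={"auto_loan", "checking"},
    conclusion={"home_loan"},
    min_age=25,
    max_age=45,
)
print(home_loan_targets)  # [101]
```

The same function covers the second rule by swapping in a premise of debit card and home loan, a conclusion of auto loan, and a different age band.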
So again, in this case, right, I’ve got all this stuff put together. At the end of it, I’ve run this model. I’ve saved it out to my Server. Let me pull that one up. So couple extra pieces here, but basically this is the exact same model that I showed you. I read in my file. I do the association rules that we just showed. Then I format it into a table and ultimately write the results to a database. So there’s a SQL Server on here that’s storing that information. And if I was to go out to that SQL Server, I can query it and do everything that you normally do in SQL. In this case, I’ve actually published it– no, I had it published to Power BI which of course is not going to. Well, I’ll have to get back on that. Let me pull it up this way then. But once you have it published to SQL, right, then you can wrap it with Tableau, Power BI, any other reporting tool that you want to use. Since that’s not working, I’ll at least come in here and show you how the results look. All right. So I save my information in the output table. Right. And just like what we looked at before, I’ve got my premise, my conclusion, all the confidence is out there, my lift. So I can go and write queries against this, share this with my application teams, so on and so forth. So now the really cool piece with all of this is now that I put all these use cases together, I’ve built them in Studio. I’ve published them to Server. I’ve got buy-in from my marketing and sales team, operations teams.
The last piece that we didn’t talk about is the source system updates, right? So once you have all of this, you can actually create processes where you take the information in that SQL Server table that I just showed you and feed it back to a sales engine, your CRM engine, or other systems. In this case, for the cross-selling, I could take that output and say, “Hey, for everybody that has a lift greater than 1.0 in that category, we want to go in and just recreate a campaign in the CRM that’s the next best offer.” So when I call in as a customer and it says I have an auto loan, etc., then the system’s already going to tell the agent, “Here’s what you need to be offering that particular customer.” And that’s really where you start getting a lot more value in this, right? So the first stage, get buy-in from your users, get that confidence, build the trust, then start publishing stuff out to the Server. Once you have it on Server, now you’re updating your models, you’re sharing things through the repository. Again, you’re building more and more confidence through your enterprise. And this also allows you to integrate it as a part of your enterprise operations just like your BI reports, your operational reports, now your RapidMiner data becomes operational data that everybody wants to have.
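A next-best-offer lookup on top of the stored rules might look like the sketch below. It assumes the rules table has been loaded into a list of records with premise, conclusion, and lift, and it applies the lift greater than 1.0 cutoff mentioned above. The rule values and product names are hypothetical, and a real CRM integration would go through that system's own interface.

```python
# Hypothetical rules as they might be read back from the results table.
rules = [
    {"premise": {"auto_loan", "checking"}, "conclusion": "home_loan", "lift": 1.8},
    {"premise": {"debit_card", "home_loan"}, "conclusion": "auto_loan", "lift": 1.4},
    {"premise": {"savings"}, "conclusion": "credit_card", "lift": 0.9},
]

def next_best_offer(products: set, rules: list, min_lift: float = 1.0):
    """Pick the highest-lift rule that applies to this customer, or None."""
    candidates = [
        r for r in rules
        if r["lift"] > min_lift
        and r["premise"] <= products          # customer holds everything in the premise
        and r["conclusion"] not in products   # and does not already have the offer
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r["lift"])["conclusion"]

# What the agent's screen could show when each customer calls in.
print(next_best_offer({"auto_loan", "checking", "savings"}, rules))  # home_loan
print(next_best_offer({"savings"}, rules))                           # None
```

Returning nothing when no rule clears the cutoff matters as much as returning an offer: it keeps agents from pitching products the data does not support.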
All right. So one last poll here to kind of see again where everybody is. So we just want to see, now that we’ve seen this on the three different use cases that I’ve gone through, do any of you have active use cases that you can think of that RapidMiner would be applicable for? People are thinking harder on this one. Couple more seconds. All right. We’re going to close it off. Okay. So 63% of those that voted said yes, they do have an active use case that they’re interested in using RapidMiner for. That’s great. And again, the whole point of this session was to really kind of– it can be a little bit daunting to get started with any software. But knowing how RapidMiner Studio works, how it integrates with RapidMiner Server, and how you build end-to-end solutions using this product is really what I wanted to focus this webinar on and really let everybody get comfortable with.
So next step, and then I’ll turn it back over to Hayley for questions. If you do have questions, you can reach me. There’s my email and my phone number. We can do a discovery call, figure out what you’re trying to solve, if there’s a good way to do that in RapidMiner, explore what you’ve done already, what are other solutions, right? How can predictive analytics really help you achieve your goals? And then the last part down there, the free RapidMiner reference card, I put together the most commonly used operators in RapidMiner Studio that I feel everybody needs to know. And a lot of them are ones that I use all the time. Some are ones I don’t use as often, but just knowing that they’re out there and how to leverage it is a big time saver. So you can go out. All you have to do is click on that, put in your email address, and you can get that reference card emailed to you. Appreciate everybody’s time. So, Hayley, I will open it up to you.
Great. Thanks, Derek, for the great presentation. And as a reminder to those on the line, we will be sending a recording within the next few business days via email if you happen to miss that at the beginning of the webinar. So like Derek said, now it’s time to go ahead and get to your audience questions. I see a few questions that have come through here, so feel free to ask any questions via the questions panel on the right-hand side of your screen. So I’ll go through. I see one question here. This one’s for you, Derek. This person was asking during your presentation, so the RapidMiner model has to be deployed on the Server?
Yes, yes. So when you’re in– let me see if I can pull it back over. So this is my model. I’m in Studio right now and this is the one that saves to a SQL Server. So in this case, I’ve got my repositories. So I save my process and I’ve got my RapidMiner Server repository, which is the actual Server itself, and I save it out there. And then there’s a URL to go to the Server’s web-based interface, and then I go out to the web and interact with the Server that way. So you can save locally, but I could do the same thing here locally, save it locally, and it would write to the SQL Server. But every time I wanted to automate, update, anything else, I would have to be the one that opens this up and runs the process. There’s not an automation from RapidMiner Studio.
All right. Thanks, Derek. Another person here, is it possible to retrain the model with new data unattended? Is it advised?
Well, yeah, so you can. And again part of the process – and I didn’t get into a lot of technical stuff here on purpose, but – yeah, absolutely. When you go through this process– let me go back up to this one. So when I deploy my Server out here, I build my model in Studio, and let’s say I’ve got queries looking to some operational data. So everything is automated as fresh data comes in. Maybe I’m going to use the last 24 months as my criteria, and that’s always going to be a rolling 24 months. My Studio model, I’ll know the accuracy, the precision, how it’s performing, and I want to record that information. And then absolutely as you’re running it, and I push it out to the Server, well the Server is going to do the same thing. It’s just going to say, “Okay, I’m going to get the last 24 months of data. I’m going to run the model.” And as your model degrades over time because maybe it’s bad data, etc., you want to be able to compare. When I published this model, I was at 83% performance and now, two months later, I’m at 78%. So there’s got to be a process as you publish these out. You should be monitoring what’s the performance of your model. You have some type of threshold that says, “Hey, whenever a model drops below whatever it is for your–” it could be for your enterprise. It could be by model, depending upon the amount of data that you have. But you might say, “Hey, whenever a model drops below 77%, I want to go back in and re-evaluate that model.” Maybe there’s new data that you can add to it. Maybe you had a lot of bad data come in. Those type of things, yeah, but care and feeding of your model is very important.
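The monitoring rule in that answer, recording each model's performance and flagging it for re-evaluation once it drops below a threshold such as 77%, might look something like this in outline. The model names, the shape of the history, and the scores other than 83%, 78%, and the 77% threshold are made up for illustration; this is a sketch of the idea, not a feature of RapidMiner.

```python
def latest_scores(history):
    """Return the most recent recorded score for each model that has any."""
    return {model: scores[-1] for model, scores in history.items() if scores}


def models_needing_review(history, threshold=0.77):
    """List models whose latest score has dropped below the threshold."""
    latest = latest_scores(history)
    return sorted(model for model, score in latest.items() if score < threshold)


# Hypothetical performance log: one score per scheduled run, oldest first.
performance_history = {
    "customer_churn": [0.83, 0.81, 0.78],  # degraded, but still above 0.77
    "cross_sell": [0.80, 0.76],            # now below the threshold
}

flagged = models_needing_review(performance_history)
print("Re-evaluate:", flagged)
```

The threshold is a parameter because, as noted in the answer, it could be set once for the enterprise or separately per model.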
Thanks, Derek. This person asked during your presentation, what sort of savings have you seen with RapidMiner?
It really depends, the common IT answer. But really I’ve done different projects where we’ve seen customers that we had predicted that were at a high risk of churn, and we’re able to do a proactive outreach campaign, and we were able to save over 50% of those by moving them to a different product, term, etc. So there’s at least that one. The cross-selling type of opportunities are out there where, again, you can measure it, but as you’re building your models and deploying them, you need to be thinking about that. That was one of the things I traditionally like to do. I’d say, “Okay, what’s the performance of the class of customers that we’re going to impact for the last 6, 9, 12 months?” So let’s say churn rate was 25% against a certain class of customers. And you’ve built a model to try to reduce churn on that. Well, then you know, okay, I was at 25%. That’s my control. I’m now doing something to not everybody in that same class, but maybe a subset, right, 1,000, 2,000, 5,000, depending on the number of customers you have. And then you want to monitor that performance over 3, 4, 5, 6 months and see, all right, my control group is at 25%. The group that we’re actually doing something with and we’ve run against RapidMiner, maybe they’re 20%. There’s a 5% savings times the cost of the customer churning. So there’s a lot of different ways you can get an ROI calculation on this because you do have in most cases historical information that you can calculate the costs.
Okay. Thanks, Derek. Another person here is asking– they want to use RapidMiner, but if they’re new to data mining projects, which version of RapidMiner Studio do you recommend for them?
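As a rough worked version of that control-group arithmetic: with a 25% churn rate in the control group, 20% in the treated group, and 1,000 treated customers, the 5-point difference is about 50 customers retained, which is then multiplied by the cost of a churned customer. The group sizes and the cost figure of 500 below are placeholders chosen for illustration; a real calculation would use historical cost data.

```python
def churn_rate(churned, total):
    """Share of customers in a group who churned during the observation window."""
    if total <= 0:
        raise ValueError("total must be positive")
    return churned / total


def estimated_savings(control_rate, treated_rate, treated_customers, cost_per_churn):
    """Churn avoided in the treated group, valued at the cost of one churned customer."""
    avoided = (control_rate - treated_rate) * treated_customers
    return avoided * cost_per_churn


# Placeholder counts that reproduce the 25% and 20% rates from the example.
control_rate = churn_rate(churned=250, total=1000)
treated_rate = churn_rate(churned=200, total=1000)

savings = estimated_savings(
    control_rate,
    treated_rate,
    treated_customers=1000,
    cost_per_churn=500.0,  # hypothetical cost of losing one customer
)
print(f"Estimated savings: {savings:,.0f}")
```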
Really, I would say at least small. The free gets you a certain number of– I think with free, you can go out and kind of understand how RapidMiner works and all of that. But if you really want to get in and start mining data, you need to get past kind of the free limitation on rows potentially. So yeah, definitely. I always looked at RapidMiner and the way it’s licensed as a step model. So if you come in on small, use small until you don’t need small anymore, and then go up to medium. And then RapidMiner Server’s out there. So you can use the RapidMiner small license and deploy it through a RapidMiner Server, so.
Great. Another question here. Someone’s asking about streaming processes in RapidMiner. Is that possible?
I honestly don’t know. I haven’t tried to do any type of IoT streaming information.
Great. We can follow up with–
Dylan, do you know? Yeah.
Yeah. That’s something we’ll follow up with the audience on. There are some partners that have some things around that, so. And just real quick on the question around which version to start with. The free version, you’re right, Derek, I would start with the free version. It’s a great way to do kind of what you describe where you really just get acclimated to the tool. So all the data prep and modeling and validation to get acclimated first with the free version. And then, like the project you showed, then move up to the small. Advantage of the small is that you get access to support and more performance really. So it’s more rows and better performance. But starting with the free version is a great way to go, so.
Thanks, guys. And, yeah, if you guys have questions here, we can also follow up after the webinar. Another question here, sort of a follow-on to the last question is, what is the estimated time to get a project to operation and production?
Did you guys hear that last question?
The question was how long does it take to get a project up to production. Is that it, Hayley?
Yeah, that’s correct. It looks like Derek might have lost audio for just a second there, but, Dylan, if you want to go ahead and address–
Yeah, I’m here.
–that one question.
Yeah, sorry. I got dropped off.
Yeah. So it’s typically a couple of weeks, and Derek’s got hands-on experience here obviously, working on both sides. But, yeah, it’s a couple of weeks to get used to your data. That’s kind of the first step. So typically what we would see is it’s going to be a couple of weeks for initial sort of proof of concept, and then you’re doing piloting of the model. So I would say three months total before you get to the point where you can operationalize something. Derek, do you agree? But–
Yeah. No, sorry. My GoToMeeting shut down. Yeah, I always try to do things. And again, most of your operations teams, they don’t want to wait a year, six months, anything like that. So I would definitely say focus on somewhere between an 8-to-12-week project that you’re providing information. Whether it’s a good model or not, you want to start having these conversations once you understand the data with your business team as soon as possible so that if you’re pulling information in and you’re walking them through how the model works– and like I said earlier if they go, “Oh, yeah, we already know that product’s bad.” All right, let me eliminate that from future models and see if there’s other information I can add in there. So it’s highly iterative. And the sooner that you can start providing some information to your ops teams for them to understand how it works, they’re– and it’s going to drive interest. It’s going to get them excited about it. But I would definitely say too, as you start this process, if you’re not already having conversations with your ops team by week two of just looking at the data and say, “Okay. When I give you this model, what are you going to do with it?” it’s too late because a lot of times they’re already running at 110% capacity. So they really have to think about, “All right. If you give me this, what am I going to do? Right? Am I going to have an outbound team have to do something? Do I have a third party that I have to engage with?” All those type of decisions, they should be thinking about while you’re building out and testing models to get them that information.
Great. Thanks for that answer, Derek and Dylan. So it looks like we’re just about at time here. For those that still have questions, if we weren’t able to address those here on the line, we will make sure to follow up with you via email within the next few business days. And we’ll also be sending the recording, like I said, in the next few business days as well. So thanks again, Derek and Dylan, and thanks again, everyone for joining us for today’s presentation. And I hope everyone has a great day.
All right. Thanks, everybody.