

Validating machine learning models isn’t easy, but it’s a critical part of your project. Here’s how to do it the right way.
Calculating model accuracy is a critical part of any machine learning project, yet many data science tools make it difficult or impossible to assess the true accuracy of a model. Typically, tools validate only the model itself, not the preprocessing and parameter-selection steps around it. Or, even worse, they don't support tried-and-true techniques like cross-validation.
This whitepaper addresses the four main components needed to ensure that you're validating machine learning models correctly, and explains how this type of validation works in the RapidMiner platform.
A Brief Introduction
All data scientists have been in a situation where they think a machine learning model will do a great job of predicting something, but once it’s in production, it doesn’t perform as well as expected. In the best case, this is only an annoying waste of time. But in the worst case, a model performing unexpectedly can cost millions of dollars – or potentially even human lives!
So was the predictive model wrong in those cases? Possibly. But oftentimes it's not the model that is wrong; it's how the model was validated. Incorrect validation leads to over-optimistic expectations of how the model will perform in production.
Since the consequences are often dire, we’re going to discuss how to prevent mistakes in model validation and the necessary components of a correct validation. Read more in the whitepaper below.
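To make the idea concrete before diving in: the core technique the whitepaper relies on is k-fold cross-validation, where the data is split into k disjoint folds and the model is repeatedly trained on k-1 folds and evaluated on the held-out one, so every example is tested exactly once on a model that never saw it during training. The sketch below is not RapidMiner's implementation; it is a minimal plain-Python illustration using a deliberately simple, hypothetical "nearest class mean" classifier as the model.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_validate(X, y, fit, predict, k=5):
    """Estimate accuracy: train on k-1 folds, test on the held-out fold."""
    folds = k_fold_indices(len(X), k)
    accuracies = []
    for i in range(k):
        held_out = set(folds[i])
        train_X = [x for j, x in enumerate(X) if j not in held_out]
        train_y = [t for j, t in enumerate(y) if j not in held_out]
        model = fit(train_X, train_y)  # fitted only on training folds
        correct = sum(predict(model, X[j]) == y[j] for j in held_out)
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / k

# Toy model (an illustrative assumption, not from the whitepaper):
# predict the class whose training mean is nearest to the input value.
def fit_nearest_mean(train_X, train_y):
    means = {}
    for label in set(train_y):
        values = [x for x, t in zip(train_X, train_y) if t == label]
        means[label] = sum(values) / len(values)
    return means

def predict_nearest_mean(means, x):
    return min(means, key=lambda label: abs(x - means[label]))

# Two well-separated classes, so cross-validated accuracy is perfect here.
X = [0.1 * i for i in range(10)] + [10 + 0.1 * i for i in range(10)]
y = [0] * 10 + [1] * 10
print(cross_validate(X, y, fit_nearest_mean, predict_nearest_mean, k=5))  # -> 1.0
```

The crucial detail, and the mistake the whitepaper warns against, is that everything learned from the data (here, the class means) is computed inside the loop from the training folds only; computing it once on the full dataset would leak information from the test folds and inflate the accuracy estimate.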