What is ensemble learning?
Ensemble learning is a method for improving predictive performance by combining different machine learning models—much as you might blend different instruments and voices in a musical ensemble to produce a better sound. This meta-approach helps to minimize two major factors that can cause machine learning error: bias and variance.
Here, bias has a technical meaning: it refers to a model that underfits the data and thus doesn’t produce accurate predictions. Variance, on the other hand, refers to a model that overfits the data, so its predictions fluctuate when the training dataset changes.
Modeling always involves a tradeoff between bias and variance. But by combining several base models (often referred to as “weak learners”) that may individually suffer from high bias or high variance, ensemble learning creates a balanced “strong learner” model with relatively low levels of bias and variance.
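One way to see why combining weak learners helps: averaging many unbiased but noisy predictors cancels much of their individual noise, reducing variance without adding bias. A minimal, purely illustrative sketch (the predictor and its noise level are hypothetical):

```python
import random

random.seed(0)

TRUE_VALUE = 10.0  # the quantity each learner is trying to predict

def weak_prediction():
    # A weak learner: unbiased on average, but noisy (high variance).
    return TRUE_VALUE + random.gauss(0, 2.0)

def ensemble_prediction(n=25):
    # Averaging n independent weak predictions damps the noise.
    return sum(weak_prediction() for _ in range(n)) / n

# Compare the average absolute error of a lone weak learner vs. the ensemble.
single_errors = [abs(weak_prediction() - TRUE_VALUE) for _ in range(1000)]
ensemble_errors = [abs(ensemble_prediction() - TRUE_VALUE) for _ in range(1000)]

single_mean = sum(single_errors) / len(single_errors)
ensemble_mean = sum(ensemble_errors) / len(ensemble_errors)
```

Run this and `ensemble_mean` comes out far smaller than `single_mean`: the ensemble is a steadier predictor even though every component is equally noisy.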
Different types of ensemble learning
There are three main types of ensemble learning strategies:
Bagging seeks to generate a diverse group of ensemble learners by varying the training data: random subsets are drawn from the larger dataset with replacement (bootstrap sampling), so the same data point can appear in several subsets. The models trained on these subsets are then aggregated using a deterministic averaging process (or a majority vote, for classification).
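The bagging recipe can be sketched in a few lines of plain Python. The dataset and the weak learner here are hypothetical stand-ins: a least-squares slope through the origin, fit to each bootstrap sample and then averaged.

```python
import random

random.seed(1)

# Toy 1-D regression data: y = 2x plus noise (hypothetical example).
data = [(x, 2 * x + random.gauss(0, 1)) for x in range(20)]

def bootstrap_sample(dataset):
    # Draw len(dataset) points *with replacement* — the "bootstrap" in bagging.
    return [random.choice(dataset) for _ in dataset]

def fit_slope(sample):
    # A deliberately weak learner: least-squares slope through the origin.
    return sum(x * y for x, y in sample) / sum(x * x for x, _ in sample)

# Train one weak learner per bootstrap sample...
slopes = [fit_slope(bootstrap_sample(data)) for _ in range(50)]

# ...then aggregate them with a deterministic average ("bootstrap AGGregatING").
bagged_slope = sum(slopes) / len(slopes)

def bagged_predict(x):
    return bagged_slope * x
```

Each bootstrap learner sees a slightly different dataset, so their errors partly cancel when averaged; the recovered slope lands close to the true value of 2.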
Boosting involves training weak learner models sequentially, meaning that the second model attempts to correct inaccurate predictions from the first model, the third model corrects inaccurate predictions from the second model, and on and on. With this strategy, the training dataset remains the same through each round of analysis.
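The sequential, error-correcting loop of boosting can also be sketched directly. This toy example (hypothetical data; the weak learner is a shrunken least-squares slope, in the style of gradient boosting) fits each new learner to the residuals—whatever the ensemble so far still gets wrong—while the training data itself stays fixed:

```python
# Hypothetical toy data: y = 2x exactly, so the ensemble should converge.
data = [(1, 2.0), (2, 4.0), (3, 6.0)]

LEARNING_RATE = 0.5  # shrinkage: each weak learner only partly corrects
corrections = []     # one weak learner (here, just a slope) per round

def boosted_predict(x):
    # The ensemble prediction is the sum of every round's correction.
    return sum(c * x for c in corrections)

for _ in range(20):
    # Fit the next weak learner to the *residuals* of the current ensemble.
    residuals = [(x, y - boosted_predict(x)) for x, y in data]
    slope = sum(x * r for x, r in residuals) / sum(x * x for x, _ in residuals)
    corrections.append(LEARNING_RATE * slope)
```

Each round shrinks the remaining error geometrically, so after 20 rounds the summed corrections sit essentially on top of the true relationship.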
Stacking involves training several weak learner models in parallel (independently from one another) and then training a strong learner meta-model to make a final prediction based on the different weak models’ predictions.
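A stacking sketch, again with hypothetical data and deliberately weak base learners: one fits a slope through the origin, the other fits a constant, and a least-squares meta-model (solved via the 2×2 normal equations) learns how to weight their predictions.

```python
import random

random.seed(2)

# Hypothetical toy data: y = x + 3 plus a little noise.
data = [(x, x + 3 + random.gauss(0, 0.1)) for x in range(10)]

# Two weak learners, trained independently (in parallel, conceptually):
slope = sum(x * y for x, y in data) / sum(x * x for x, _ in data)
const = sum(y for _, y in data) / len(data)

def learner_a(x): return slope * x   # slope-only model
def learner_b(x): return const      # constant model

# The meta-model trains on the weak learners' *predictions*, not the raw x.
features = [(learner_a(x), learner_b(x)) for x, _ in data]
targets = [y for _, y in data]

# Least-squares weights (w_a, w_b) via the normal equations for two features.
saa = sum(a * a for a, _ in features)
sab = sum(a * b for a, b in features)
sbb = sum(b * b for _, b in features)
say = sum(a * y for (a, _), y in zip(features, targets))
sby = sum(b * y for (_, b), y in zip(features, targets))
det = saa * sbb - sab * sab
w_a = (say * sbb - sab * sby) / det
w_b = (saa * sby - sab * say) / det

def stacked_predict(x):
    return w_a * learner_a(x) + w_b * learner_b(x)
```

Neither base learner can represent y = x + 3 on its own, but the meta-model's weighted blend of the two recovers it almost exactly—the essence of stacking.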
How to approach ensemble learning
Choosing the right ensemble strategy will depend on the problem you’re trying to solve and the type of data that you’re working with.
In general, base models with low bias but high variance are well suited for bagging, while base models with high bias and low variance are better suited for boosting. Whichever approach fits your problem, an ensemble often produces more accurate predictions and greater model stability than any single model.
Ensemble learning can have drawbacks, however:
- Complexity: Because ensemble learning involves the combination of different models, the results it produces can be difficult to interpret or to explain to others.
- Investment: Ensemble learning models generally “cost” more to create, train, and deploy. ROI is thus an important consideration.
- Time: The time needed for computation and model design can be significant. Ensemble learning is thus not a good fit for real-time (low latency) applications.
- Redundancy: In some scenarios, the ensemble may perform no better (or even worse) than the best-performing weak learner.
For these reasons, it’s important to work with a data scientist who can help you customize an ensemble learning model that addresses your specific needs. As computational power has increased over the past decade, so has the number of applications for ensemble learning—including financial fraud detection, remote sensing, computer security, facial recognition, and many more.