If you’ve just started to explore the ways that machine learning can impact your business, the first questions you’re likely to come across are: What are the different types of machine learning algorithms? What are they good for? And which one should you choose for your project? This post will help you answer those questions.
There are a few different ways to categorize machine learning algorithms. One way is based on what the training data looks like. There are three different categories used by data scientists with respect to training data:
- Supervised, where the algorithms are trained based on labeled historical data—which has often been annotated by humans—to try and predict future results.
- Unsupervised, by contrast, uses unlabeled data that the algorithms try to make sense of by extracting rules or patterns on their own.
- Semi-supervised, which is a mix of the two above methods, usually with the preponderance of data being unlabeled, and a small amount of supervised (labeled) data.
Another way to classify algorithms—and one that’s more practical from a business perspective—is to categorize them based on how they work and what kinds of problems they can solve, which is what we’ll do here.
There are three basic categories here as well: regression, clustering, and classification algorithms. Let’s jump into each.
There are two kinds of regression algorithms that we commonly see in business environments, both based on the same regression that may be familiar to you from statistics.
1. Linear regression
Described very simply, linear regression fits a line through a set of data points, relating a dependent variable (plotted on the y-axis) to an explanatory variable (plotted on the x-axis).
Linear regression is a commonly used statistical model that can be thought of as a kind of Swiss Army knife for understanding numerical data. For example, linear regression can be used to understand the impact of price changes on goods and services by mapping various price points against the sales they produced, in order to help guide pricing decisions. Depending on the specific use case, some of the variants of linear regression, including ridge regression, lasso regression, and polynomial regression, might be suitable as well.
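To make the price-versus-sales example concrete, here is a minimal sketch of simple linear regression using the closed-form least-squares solution. The price and sales figures are made up for illustration:

```python
# Ordinary least squares for one explanatory variable:
#   slope = cov(x, y) / var(x),  intercept = mean(y) - slope * mean(x)

prices = [10.0, 12.0, 14.0, 16.0, 18.0]       # hypothetical unit prices
sales = [200.0, 180.0, 160.0, 140.0, 120.0]   # hypothetical units sold

n = len(prices)
mean_x = sum(prices) / n
mean_y = sum(sales) / n

cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(prices, sales))
var_x = sum((x - mean_x) ** 2 for x in prices)

slope = cov_xy / var_x
intercept = mean_y - slope * mean_x

# Use the fitted line to predict sales at a new price point
predicted = intercept + slope * 15.0
```

In practice you would reach for a library such as scikit-learn or statsmodels, which also report goodness-of-fit measures, but the underlying arithmetic is no more than this.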
2. ARIMA
ARIMA (“autoregressive integrated moving average”) models can be considered a special type of regression model. An ARIMA model treats data points as a sequence rather than as independent observations, which lets you explore time-dependent data. For this reason, ARIMA models are especially useful for time-series analyses, for example, demand and price forecasting.
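A full ARIMA implementation involves differencing and moving-average terms, but the autoregressive building block, predicting each point from the one before it, can be sketched as a least-squares fit of each value on its predecessor. The demand series below is invented for illustration:

```python
# Fit an AR(1) model x_t ≈ c + phi * x_{t-1} by least squares —
# the "AR" piece of ARIMA, sketched on a toy demand series.

series = [100.0, 104.0, 103.0, 107.0, 106.0, 110.0, 109.0, 113.0]

x_prev = series[:-1]   # x_{t-1}
x_next = series[1:]    # x_t

n = len(x_prev)
mean_p = sum(x_prev) / n
mean_n = sum(x_next) / n

phi = sum((a - mean_p) * (b - mean_n) for a, b in zip(x_prev, x_next)) \
      / sum((a - mean_p) ** 2 for a in x_prev)
c = mean_n - phi * mean_p

# One-step-ahead forecast from the last observed value
forecast = c + phi * series[-1]
```

For real forecasting work, a library such as statsmodels handles the integrated and moving-average components, seasonality, and model selection for you.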
Clustering algorithms are typically used to find groups in a dataset, and there are a few different types of algorithms that can do this.
3. k-means clustering
k-means clustering is generally used to identify data points with related characteristics and group them together.
Businesses looking to develop customer segmentation strategies might use k-means clustering to better target marketing campaigns to the groups of customers most likely to respond. Another use case for k-means clustering is insurance fraud detection: comparing current cases against historical cases that turned out to be fraudulent.
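The segmentation idea can be sketched in a few lines. This toy example clusters customers by a single "annual spend" feature with k = 2; the spend values and starting centroids are illustrative:

```python
# Minimal k-means sketch (k = 2) on one-dimensional "annual spend" values.
# Real projects would use a library such as scikit-learn's KMeans.

spend = [12.0, 15.0, 14.0, 13.0, 80.0, 85.0, 78.0, 90.0]

# Start with two arbitrary centroids
centroids = [spend[0], spend[-1]]

for _ in range(10):                       # a few refinement passes
    clusters = [[], []]
    for value in spend:
        # Assign each point to its nearest centroid
        nearest = min((0, 1), key=lambda i: abs(value - centroids[i]))
        clusters[nearest].append(value)
    # Recompute each centroid as the mean of its cluster
    centroids = [sum(c) / len(c) for c in clusters]
```

The two centroids settle on a low-spend and a high-spend segment; with more features, the same assign-then-recompute loop runs on vectors instead of single numbers.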
4. Agglomerative & divisive clustering
Agglomerative clustering is a method for finding hierarchical relationships among clusters in a dataset.
It uses a bottom-up approach, putting each individual data point into its own cluster and then merging similar clusters together. Divisive clustering takes the opposite approach: it starts with all the data points in a single cluster and then splits that cluster into progressively smaller ones.
A timely use case for these clustering algorithms is tracking viruses: combined with DNA analysis, hierarchical clustering helps scientists better understand mutation rates and transmission patterns.
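The bottom-up merging described above can be sketched as follows: every point starts in its own cluster, and the two closest clusters (by single linkage, the distance between their closest members) are merged until the desired number of clusters remains. The points are toy one-dimensional values:

```python
# Bottom-up (agglomerative) clustering sketch: start with singleton
# clusters and repeatedly merge the two closest, using single linkage.

points = [1.0, 2.0, 9.0, 10.0, 25.0]
clusters = [[p] for p in points]          # every point starts alone

def single_linkage(a, b):
    """Distance between clusters = distance of their closest members."""
    return min(abs(x - y) for x in a for y in b)

while len(clusters) > 2:                  # stop at two clusters
    # Find the closest pair of clusters
    i, j = min(
        ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
        key=lambda ij: single_linkage(clusters[ij[0]], clusters[ij[1]]),
    )
    merged = clusters[i] + clusters[j]
    clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
```

Recording the order of the merges, rather than just the final clusters, is what produces the hierarchy (the dendrogram) these methods are known for.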
Classification algorithms are similar to clustering algorithms, but while clustering algorithms are used to both find the categories in data and sort data points into those categories, classification algorithms sort data into predefined categories.
5. k-nearest neighbors
Not to be confused with k-means clustering, k-nearest neighbors is a pattern classification method that compares a new data point against all previously seen examples and labels it according to the ones it most closely resembles.
k-nearest neighbors is often used for activity analysis in credit card transactions, comparing transactions to previous ones. Abnormal behavior, like using a credit card to make a purchase in another country, might trigger a call from the card issuer’s fraud detection unit. This machine learning algorithm can also be used for visual pattern recognition, and it’s now frequently used as part of retailers’ loss prevention tactics.
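A minimal k-nearest-neighbors sketch for the transaction example might look like the following. The features (transaction amount, distance from home) and labels are invented toy values:

```python
# k-nearest neighbors sketch (k = 3): classify a new point by majority
# vote among the closest labeled examples.
from collections import Counter
import math

# (amount, distance_from_home_km) -> label
training = [
    ((20.0, 5.0), "normal"),
    ((35.0, 2.0), "normal"),
    ((15.0, 8.0), "normal"),
    ((900.0, 4000.0), "fraud"),
    ((1200.0, 3500.0), "fraud"),
]

def classify(point, k=3):
    # Sort past examples by distance to the new point
    dists = sorted(
        (math.dist(point, features), label) for features, label in training
    )
    # Majority vote among the k nearest
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

label = classify((25.0, 6.0))
```

In a real deployment the features would be scaled first, since raw distance calculations let large-valued features (like amount) dominate small ones.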
6. Tree-based algorithms
Tree-based algorithms, including decision trees, random forests, and gradient-boosted trees, are used to solve classification problems. Decision trees excel at understanding datasets that have many categorical variables and can be effective even when some data is missing.
They’re primarily used for predictive modeling, and are helpful in marketing, answering questions like “which tactics should we be doing more of?” A decision tree might help an email marketer decide which customers would be more likely to order based on specific offers.
A random forest algorithm uses multiple trees to come up with a more complete analysis. In a random forest algorithm, multiple trees are created, and the forest uses the average decisions of its trees to make a prediction.
Gradient-boosted trees (GBTs) also use decision trees but rely on an iterative approach to correct for any mistakes in the individual decision tree models. GBTs are widely considered to be one of the most powerful predictive methods available to data scientists and can be used by manufacturers to optimize the pricing of a product or service for maximum profit, among other use cases.
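The random-forest idea of averaging many trees can be sketched in miniature with one-rule "stump" trees voting on the email-marketing question above. The features, thresholds, and function names are illustrative, not learned from real data:

```python
# Random-forest idea in miniature: several simple trees (here, one-rule
# "stumps") each vote, and the forest takes the majority. In a real
# forest, each tree is trained on a random sample of data and features.
from collections import Counter

def stump_recency(customer):   # ordered recently?
    return "will_order" if customer["days_since_order"] < 30 else "wont_order"

def stump_opens(customer):     # opens marketing emails?
    return "will_order" if customer["email_open_rate"] > 0.4 else "wont_order"

def stump_spend(customer):     # high past spend?
    return "will_order" if customer["total_spend"] > 500 else "wont_order"

forest = [stump_recency, stump_opens, stump_spend]

def predict(customer):
    votes = Counter(tree(customer) for tree in forest)
    return votes.most_common(1)[0][0]

customer = {"days_since_order": 12, "email_open_rate": 0.6, "total_spend": 200}
prediction = predict(customer)
```

Gradient boosting differs in that the trees are built sequentially, each one fitted to the errors of the ensemble so far, rather than independently and in parallel.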
7. Support vector machine
A support vector machine (SVM) is, according to some practitioners, the most popular machine learning algorithm. It’s a classification (or sometimes a regression) algorithm that separates a dataset into classes; for example, two classes might be separated by a line that marks the boundary between them.
There could be an infinite number of lines that do the job, but SVM helps find the optimal line. Data scientists are using SVMs in a wide variety of business applications, including classifying images, detecting faces, recognizing handwriting, and bioinformatics.
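The "optimal line" intuition can be sketched directly: among candidate separating lines, prefer the one whose margin (distance to the closest point of either class) is largest. The points and candidate lines are toy values; a real SVM solves this as an optimization problem rather than checking a short list:

```python
# SVM intuition sketch: among candidate separating lines y = m*x + b,
# pick the one with the largest margin (distance to the closest point).
import math

class_a = [(1.0, 1.0), (2.0, 1.5)]
class_b = [(4.0, 4.0), (5.0, 4.5)]

def margin(m, b):
    """Smallest distance from any point to the line y = m*x + b,
    or -inf if the line fails to separate the two classes."""
    dist = lambda x, y: abs(m * x - y + b) / math.sqrt(m * m + 1)
    side = lambda x, y: m * x - y + b > 0
    if {side(*p) for p in class_a} & {side(*p) for p in class_b}:
        return float("-inf")              # both classes on the same side
    return min(dist(*p) for p in class_a + class_b)

candidates = [(-1.0, 5.75), (-1.0, 4.0), (-1.0, 6.5)]
best = max(candidates, key=lambda mb: margin(*mb))
```

Libraries such as scikit-learn's SVC also handle non-linear boundaries via kernels, which is where much of the SVM's practical power comes from.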
8. Neural networks
Neural networks are a set of algorithms designed to recognize patterns by loosely mimicking the human brain. Like the brain, neural nets are able to adapt to changing inputs, including ones they weren’t originally trained on.
A neural net can be taught to recognize, say, an image of a dog by providing a training set of images of dogs. Once the algorithm processes the training set, it can then classify novel images into ‘dogs’ or ‘not dogs’. Neural networks work on more than just images, though, and can be used for text, audio, time-series data, and more. There are many different types of neural networks, all optimized for the specific tasks they’re intended to work on.
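The smallest possible "neural network" is a single artificial neuron, and training one shows the core mechanics: a weighted sum, a non-linear activation, and weight updates driven by prediction error. This sketch teaches one neuron the logical OR function by gradient descent:

```python
# A single artificial neuron (a logistic unit) trained by gradient
# descent to learn the OR function — a neural network with no hidden layers.
import math

data = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 1.0)]

w = [0.0, 0.0]
bias = 0.0
lr = 0.5                                  # learning rate

def forward(x):
    z = w[0] * x[0] + w[1] * x[1] + bias
    return 1.0 / (1.0 + math.exp(-z))     # sigmoid activation

for _ in range(2000):                     # training epochs
    for x, target in data:
        out = forward(x)
        err = out - target                # gradient of cross-entropy loss w.r.t. z
        w[0] -= lr * err * x[0]
        w[1] -= lr * err * x[1]
        bias -= lr * err

predictions = [round(forward(x)) for x, _ in data]
```

A dog-vs-not-dog image classifier works the same way in principle, just with millions of weights arranged in many layers and images as the inputs.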
Some of the business applications for neural networks are weather prediction, face detection and recognition, transcribing speech into text, and stock market forecasting. Marketers are using neural networks to target specific content and offers to customers who would be most ready to act on the content.
Deep learning is a subset of neural networks in which algorithms ‘learn’ from large datasets using many stacked layers. Deep learning has myriad business uses, and in many cases it can outperform more general machine learning algorithms. Because deep learning generally doesn’t require humans to hand-craft features, it excels at tasks like text understanding, voice and image recognition, and autonomous driving, among many others.
Other machine learning algorithm types
In addition to the above categories, there are other types of algorithms that can be used during model creation and training to help the process, like fuzzy matching and feature selection algorithms.
9. Fuzzy matching
Fuzzy matching is a type of clustering algorithm that can make matches even when items aren’t exactly the same, due to data issues like typos. For some natural language processing tasks, preprocessing with fuzzy matching can improve results by three to five percent.
A typical use case is customer profile management. Fuzzy matching lets you identify two very similar addresses as referring to the same place, so that a single record ID and source file can be used for both.
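Python’s standard library includes a simple similarity measure that illustrates the idea: near-duplicate addresses score close to 1.0 and can be merged under one customer record. The addresses and the 0.8 threshold are illustrative:

```python
# Fuzzy matching sketch using the standard-library difflib:
# near-duplicate strings get a similarity ratio close to 1.0.
from difflib import SequenceMatcher

def similarity(a, b):
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

addr1 = "123 Main Street, Springfield"
addr2 = "123 Main St, Springfield"
addr3 = "987 Oak Avenue, Shelbyville"

same = similarity(addr1, addr2) > 0.8       # treat as the same customer
different = similarity(addr1, addr3) > 0.8
```

Production record-linkage systems layer more on top, such as phonetic encodings and token-level comparisons, but the threshold-on-a-similarity-score pattern is the same.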
10. Feature selection algorithms
Feature selection algorithms are used to whittle down the number of input variables for a model. Fewer input variables can lower the computational cost of running a model and may also improve its performance.
Commonly used techniques like PCA and mRMR aim to retain as much information as possible in a reduced subset of features. Using a subset of features can be beneficial because your model may be less confused by noise, and the computation time of your algorithm will go down. Feature selection has been used, for example, to surface business competitor relationships.
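As a minimal sketch of the filter style of feature selection, the example below drops near-constant columns, the idea behind scikit-learn's VarianceThreshold. PCA and mRMR are more sophisticated, but the goal is the same: fewer, more informative inputs. The data and the 0.01 threshold are illustrative:

```python
# Filter-style feature selection sketch: drop low-variance columns,
# since a near-constant feature carries little information for a model.

rows = [
    [5.1, 1.0, 0.2],
    [4.9, 1.0, 0.2],
    [6.2, 1.0, 1.8],
    [5.9, 1.0, 1.8],
]

def variance(col):
    mean = sum(col) / len(col)
    return sum((v - mean) ** 2 for v in col) / len(col)

columns = list(zip(*rows))                  # transpose to per-feature lists
keep = [i for i, col in enumerate(columns) if variance(col) > 0.01]

reduced = [[row[i] for i in keep] for row in rows]
```

Here the constant middle column is dropped; variance-based filtering ignores the target variable, which is why methods like mRMR, which score features by relevance to the target, often select better subsets.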
If you want to dive deeper into machine learning, including how to get your first project off the ground, check out RapidMiner’s Human’s Guide to Machine Learning Projects.