01 August 2016

Gradient Boosted Trees? Deep Learning? In Less Than 5 Minutes? You Bet!

As most of you are already aware, RapidMiner is a kick-ass platform offering pretty much everything you need for doing data science in a very efficient way.  But what you don’t know is that …

RapidMiner Studio just got even more awesome! Wait… is this even possible?  Well, it was no easy task – but we have done it: Introducing RapidMiner Studio 7.2. Let’s take a look at some of the new features:

New Machine Learning Algorithms

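We’ve added four new machine learning algorithms: Gradient Boosted Trees, Deep Learning, Generalized Linear Models, and a brand-new implementation of Logistic Regression. I am still having a hard time figuring out which one I like the most.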

Naturally, I gave them a test run on some data sets, and was pretty freakin’ impressed with the prediction accuracy, automatic tuning capabilities, and runtimes. On the well-known Sonar data set, for example, I consistently achieved performance results of 78% to 80% without any parameter tuning. This is a nice bump over other algorithms, which only get up to 70% to 75% after heavy optimization cycles.

This lift in performance can in part be attributed to the fact that these algorithms tune themselves. They are designed to find the best parameter settings for optimizing prediction accuracy. This not only delivers better accuracy, but also reduces some of the effort required for tuning these bad boys.
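To make the idea of self-tuning a bit more concrete, here is a minimal Python sketch of picking a parameter automatically by checking accuracy on held-out data. It is only a generic illustration, not how the RapidMiner operators tune themselves: the tiny data set, the nearest-neighbor classifier, and the names `knn_predict`, `accuracy`, and `auto_tune` are all assumptions made up for this example.

```python
# Toy 1-D data as (feature, label) pairs; all values are made up for illustration.
TRAIN = [(0.1, 0), (0.2, 0), (0.3, 0), (0.4, 0),
         (0.45, 1), (0.6, 1), (0.8, 1), (0.9, 1)]
VALID = [(0.15, 0), (0.35, 0), (0.5, 1), (0.7, 1), (0.85, 1)]


def knn_predict(train, x, k):
    """Predict 0 or 1 by majority vote among the k nearest training points."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    positives = sum(label for _, label in nearest)
    return 1 if 2 * positives > k else 0


def accuracy(train, valid, k):
    """Fraction of held-out points classified correctly for a given k."""
    hits = sum(knn_predict(train, x, k) == y for x, y in valid)
    return hits / len(valid)


def auto_tune(train, valid, candidates=(1, 3, 5)):
    """Return the candidate k with the highest held-out accuracy."""
    return max(candidates, key=lambda k: accuracy(train, valid, k))


best_k = auto_tune(TRAIN, VALID)
print(best_k, accuracy(TRAIN, VALID, best_k))
```

The point of the sketch is the loop inside `auto_tune`: try a handful of settings, score each one on data the model has not seen, and keep the winner, so nobody has to do it by hand.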

By now, you might be thinking to yourself – hmmmmm, won’t this negatively impact runtime?  No, not really, since all of these new algorithms support parallel processing for multiple cores. Welcome to the next generation of machine learning, my friends – compliments of RapidMiner!

Say “Hello” to the newest operators in RapidMiner Studio: Gradient Boosted Trees, Deep Learning, Generalized Linear Models, and a brand-new implementation of Logistic Regression.

Updated Operator Search

In RapidMiner Studio 7.2, we also expanded our search functionality to support tags in addition to operator names. Why is this a big deal? Because so many different data science terms refer to the same thing that it may not be obvious which one to search for. For example, let’s say you want to get rid of some of the columns in your data, but you are not sure which operator to use. Instead of having to remember the exact name of the operator, you can now type in “filter” or “remove” or a variety of other search terms to locate the correct operator (which is called “Select Attributes”, by the way). Although simple, this enhancement proved to be very helpful for many customers who participated in our beta program. Less remembering, more mining. Sounds like a good deal to me.

Operators can now be easily found with the new tag-based operator search.
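For the curious, the general idea behind a tag-based search can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not RapidMiner’s actual implementation: the operator names come from this post, but the tag lists, the `OPERATOR_TAGS` index, and the `search_operators` function are hypothetical.

```python
# Hypothetical tag index: operator names are taken from the post,
# the tags attached to them are invented for illustration only.
OPERATOR_TAGS = {
    "Select Attributes": {"filter", "remove", "columns"},
    "Join": {"merge", "combine"},
    "Nominal to Numerical": {"encode", "convert"},
    "Write Excel": {"export", "save", "spreadsheet"},
}


def search_operators(query, index=OPERATOR_TAGS):
    """Return operators whose name or tags contain the query (case-insensitive)."""
    needle = query.strip().lower()
    if not needle:
        return []
    matches = []
    for name, tags in index.items():
        # Match on the operator name first, then fall back to its tags.
        if needle in name.lower() or any(needle in tag for tag in tags):
            matches.append(name)
    return sorted(matches)


print(search_operators("filter"))
print(search_operators("remove"))
```

With this made-up index, both “filter” and “remove” lead to Select Attributes, even though neither word appears in the operator’s name.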

Performance Enhancements

In addition to all those new operators now making use of parallel cores (if you have them), we also made an effort to speed things up a bit in other places.  A few operators we improved are: Write Excel, Set Minus, Join, Nominal to Numerical, and FPGrowth.  Our PostgreSQL connector got a nice speedup as well.

We also brought in an exterminator – so there are a couple of dozen bugs that won’t annoy you any longer. Zap!

All Features for Everybody

The free version of our RapidMiner Studio 7.2 now includes functionality that was previously only available in the commercial version, for example, connectors to commercial databases.  Moving forward, all the features of our RapidMiner platform will be available to EVERYONE.  I told you it was even more AWESOME!  This new release will allow you to truly experience the value of RapidMiner Studio, while providing us with the necessary resources to keep building great products. I love it when a plan comes together!

So stay tuned – we will publish more about these exciting changes next week with the official release of RapidMiner Studio 7.2.

