13 September 2016

RapidMiner vs R: How to use Python and R Together with RapidMiner

There have been some major advancements to the RapidMiner platform since this article was originally published. We’re on a mission to make machine learning more accessible to anyone. For more details, check out our latest release.

Pull in your Python and R scripts seamlessly

Long-time RapidMiner Studio users already know about our seamless integration with Python and R. You just need to visit the Marketplace, install the Python Scripting and R Scripting extensions, and restart RapidMiner Studio. Once you do that, you should see two new operators under the Utility > Scripting folder in your Operators list.

[Screenshot: the Execute Python and Execute R operators]

Using those operators is a snap if you already have a Python or R script ready to go. All you need to do is drag in the operators, embed your Python or R code, and run your process. The nice thing about this is that you can go from RapidMiner to Python, then to R, and back again.

Example of RapidMiner and R

Below is an example process that uses the Finance & Economics extension to download some S&P500 (^GSPC) data and then uses an R script to calculate the ARIMA point forecast. You can see how easily the R Scripting operator mixes with native RapidMiner Studio operators (Rename) and even other Studio extensions.

[Screenshot: ARIMA forecast process]

Working with these operators is quite easy, but there are a few things you need to be aware of so you don’t pull out your hair. Usually it’s as easy as pasting your R script (or Python script, for that matter) into the “{…}” of the function. You can also pass multiple inputs to these operators and get multiple outputs back by adding arguments to the function and returning more values. If you want to pass more inputs to a script, all you need to do is something like this:

rm_main = function(model, data)
{
  # my awesome R script …

  # return multiple outputs as a list, one entry per output port
  return(list(dataframe1, dataframe2))
}

Tip: If you load libraries in R (or import modules in Python), make sure they are installed and on the path of the R or Python installation that RapidMiner uses.
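For the Python side, one quick way to act on this tip is to check that each module can be found before the script relies on it. The sketch below uses only the standard library. `missing_modules` is a hypothetical helper name, not part of RapidMiner, and the check only describes the interpreter it runs in, so it is most useful when run from inside the scripting operator itself.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the top-level module names that this interpreter cannot find."""
    missing = []
    for name in names:
        # find_spec returns None when a top-level module is not on the search path
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


# sys.executable shows which Python installation is actually running the script
print("Interpreter:", sys.executable)
print("Missing:", missing_modules(["json", "pandas"]))
```

If a module shows up as missing, install it into the interpreter printed on the first line rather than into whichever Python happens to be first on your shell path.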


Here are the results:

[Screenshot: point forecast results]

Embedding Python and R inside RapidMiner operators

You can easily embed these scripts inside any RapidMiner operator that has a subprocess. This knowledge base article explains how you can use the RapidMiner Cross Validation operator and embed an R script algorithm to train and then test the model.

You can also pass RapidMiner macros INTO your scripts and change them on the fly! This is cool because you can then use the Optimize Parameters operator to optimize the output of your script!

It’s really easy to optimize your scripts if you use Set Macro operators. The screenshots below show the setting of the p, d, and q values for the ARIMA R script.

[Screenshot: creating the p, d, and q macros]

Next, call the macros in your R script:

[Screenshot: macros called in the R script]
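As a rough illustration of what calling a macro means: RapidMiner refers to macros with the `%{name}` syntax, and this sketch assumes the macro's value is pasted into the script text before the script runs. The Python below imitates that substitution on a made-up template. `fill_macros` and the `order = c(...)` line are illustrative assumptions, not RapidMiner's implementation or the script from this post.

```python
import re

# Matches %{name}, where name is made of letters, digits, or underscores (an assumption)
MACRO_PATTERN = re.compile(r"%\{(\w+)\}")


def fill_macros(script_text, macros):
    """Replace each %{name} in script_text with the matching macro value."""
    def replace(match):
        name = match.group(1)
        if name not in macros:
            raise KeyError(f"macro '{name}' is not defined")
        return str(macros[name])
    return MACRO_PATTERN.sub(replace, script_text)


# Hypothetical line from an R script that reads the p, d, and q macros
template = "order = c(%{p}, %{d}, %{q})"
print(fill_macros(template, {"p": 1, "d": 0, "q": 2}))  # order = c(1, 0, 2)
```

Because the value is inserted as text, a numeric macro lands in the script as a plain number, which is why it can be used directly as a model parameter.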

Set the range of the parameters for the macros you want to optimize:

[Screenshot: parameter ranges for the macros]

Then hit run! If you use a Log operator, you can export the AIC performance for the different combinations of p, d, and q values. It’s that simple, and you’ll love how easy it is to use.
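Conceptually, this kind of optimization is a grid search: each combination of p, d, and q is tried, the AIC of each fit is logged, and a lower AIC indicates a better trade-off between fit and complexity. The Python sketch below shows that loop under stated assumptions. `grid_search` is a hypothetical helper, and `fit_aic` is any function you supply that fits the model and returns its AIC. The stand-in scorer is made up so the example runs; it is not a real ARIMA fit.

```python
from itertools import product


def grid_search(fit_aic, p_values, d_values, q_values):
    """Score every (p, d, q) combination and return the log sorted by AIC."""
    log = []
    for p, d, q in product(p_values, d_values, q_values):
        log.append({"p": p, "d": d, "q": q, "aic": fit_aic(p, d, q)})
    # Lower AIC is better, so the best combination ends up first
    return sorted(log, key=lambda row: row["aic"])


def stand_in_aic(p, d, q):
    """Made-up score so the sketch runs; replace with a real model fit."""
    return (p - 1) ** 2 + (d - 1) ** 2 + (q - 2) ** 2


results = grid_search(stand_in_aic, range(0, 3), range(0, 2), range(0, 3))
best = results[0]
print(best["p"], best["d"], best["q"])
```

The returned list plays the role of the exported log: one row per combination, which you can sort or plot to see how the score changes across the grid.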

Final thoughts

Remember, it’s not RapidMiner vs R. It’s how you use them together.

As we mentioned above, there have been several major advancements to the RapidMiner platform since this post was originally published. Be sure to check out our latest release.
