Tips & Tricks: How to use Python and R with RapidMiner

Pull in your Python and R scripts seamlessly in RapidMiner

Longtime RapidMiner Studio users already know about our seamless integration of Python, R, and RapidMiner. You just need to visit the Marketplace, download the Python Scripting and R Scripting extensions, and restart RapidMiner Studio. Once you do, you should see two new operators under the Utility > Scripting directory in your Operator list.


Using those operators is a snap if you already have a Python and/or R script ready to go. All you need to do now is drag in those operators, embed your Python or R code, and run your process. The nice thing about this is that you can go from RapidMiner to Python, then to R, and back again. Below is an example process that uses the Finance & Economics extension to download some S&P500 (^GSPC) data and then uses an R script to calculate the ARIMA point forecast. You can see how easily the R Scripting operator mixes with native RapidMiner Studio operators (Rename) and even other Studio extensions.



Working with these operators is quite easy, but there are a few things you need to be aware of so you don’t pull out your hair. Usually it’s as easy as pasting your R script (or Python script for that matter) into the “{…}” of the function. You can also pass multiple inputs to these operators and get multiple outputs back by adding more arguments to the function and returning more objects. If you want to pass more inputs to a script, all you need to do is something like this:

rm_main = function(model, data)
{
  # my awesome R script …

  # R's return() accepts a single value, so wrap multiple outputs in a list
  return(list(dataframe1, dataframe2))
}
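The same pattern in the Python Scripting operator can be sketched as follows. This is a minimal illustration, not the process from this post: it assumes each connected input arrives in rm_main as a pandas DataFrame and that a returned tuple is mapped to the output ports in order. The column names (close, window) and the toy data are made up for the example.

```python
import pandas as pd


def rm_main(prices, settings):
    """Take two inputs and return two outputs (hypothetical column names)."""
    # First input: a price table. Second input: a one-row settings table.
    window = int(settings["window"].iloc[0])

    # Output 1: a copy of the data plus a rolling mean of the closing price.
    smoothed = prices.copy()
    smoothed["close_ma"] = smoothed["close"].rolling(window).mean()

    # Output 2: a one-row summary table.
    summary = pd.DataFrame({"rows": [len(prices)], "window": [window]})

    # Each returned DataFrame is assumed to go to its own output port, in order.
    return smoothed, summary


# Stand-alone check outside RapidMiner, using toy data.
prices = pd.DataFrame({"close": [10.0, 11.0, 12.0, 13.0]})
settings = pd.DataFrame({"window": [2]})
smoothed, summary = rm_main(prices, settings)
```

Inside RapidMiner you would paste only the function (and the import), since the operator calls rm_main for you.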


Tip: If you load any libraries in R (or import any modules in Python), just make sure they are installed and on a path that your R or Python installation can find.
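On the Python side, one way to check this before running a process is a small helper like the one below. It is only a sketch: missing_modules is a hypothetical helper name, and it should be run with the same Python interpreter that RapidMiner is configured to use, otherwise the result says nothing about what the operator will see.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that this interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# "json" ships with Python; the second name is deliberately bogus.
print(missing_modules(["json", "no_such_module_xyz"]))
```

An empty list means every module you asked about can be imported.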


Here are the results.



You can try out this sample process by downloading this: gspc-arima-example. The one thing that we don’t show you here is how you can embed an R script (or Python for that matter) into an actual RapidMiner operator, like a Cross Validation.

Embedding Python and R inside RapidMiner operators

You can easily embed these scripts inside any RapidMiner operator that has a subprocess. This knowledge base article explains how you can use the RapidMiner Cross Validation operator and embed an R script algorithm to train and then test the model. You can also extend RapidMiner macros INTO your scripts and change them on the fly! This is cool because you can easily use the Optimize Parameters operator to optimize the output of your script!

It’s really easy to optimize your scripts if you use Set Macro operators. The screenshots below show the setting of the p, d, and q values for the ARIMA R script.


Next, call the macros in your R Script:
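If you are working in the Python Scripting operator instead, a rough sketch of the same idea is shown here. It assumes the macros are named p, d, and q and that RapidMiner replaces a %{name} reference in the script text with the macro's value before the script runs. The macro_int helper and its default values are hypothetical additions, so that the snippet also runs outside RapidMiner, where the placeholders stay unexpanded.

```python
def macro_int(raw, default):
    """Convert a macro value to int, falling back if it was not expanded."""
    try:
        return int(raw)
    except ValueError:
        # Outside RapidMiner the literal placeholder text is still here.
        return default


# Assumed behaviour: inside RapidMiner these strings hold the macro values.
p = macro_int("%{p}", 1)
d = macro_int("%{d}", 0)
q = macro_int("%{q}", 1)

# The ARIMA order that the rest of the script would use.
order = (p, d, q)
```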



Set the range of the parameters for the macros you want to optimize.


Then hit run! If you use a Log operator, you can export the AIC performance versus the different combinations of p, d, and q values. It’s that simple, and you’ll love how easy it is to use.
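Conceptually, this kind of optimization is a grid search: try every combination of p, d, and q in the ranges you set, record a score for each, and keep the best one. The sketch below illustrates that idea in plain Python. It is not how RapidMiner implements it, and toy_score is a made-up stand-in for the AIC that a real ARIMA fit would report (lower AIC is better).

```python
from itertools import product


def grid_search(score, p_values, d_values, q_values):
    """Score every (p, d, q) combination; return the best order and the full log."""
    log = []
    for p, d, q in product(p_values, d_values, q_values):
        log.append((p, d, q, score(p, d, q)))
    # The row with the smallest score wins.
    best = min(log, key=lambda row: row[3])
    return best[:3], log


def toy_score(p, d, q):
    # Hypothetical stand-in for an AIC value; smallest at (2, 1, 1).
    return (p - 2) ** 2 + (d - 1) ** 2 + (q - 1) ** 2


best_order, log = grid_search(toy_score, range(0, 4), range(0, 3), range(0, 4))
```

The log list plays the role of the Log operator's table: one row per combination tried, with its score.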


Marketing Data Scientist for RapidMiner
