Pull in your Python and R scripts seamlessly in RapidMiner

Long-time RapidMiner Studio users already know about our seamless integration of Python, R, and RapidMiner. You just need to visit the Marketplace, download the Python Scripting and R Scripting operators, and restart RapidMiner Studio. Once you do, you should see two new operators under the Utility > Scripting directory in your Operator list.

executepyandr

Using those operators is a snap if you already have Python and/or R scripts ready to go. All you need to do is drag in the operators, embed your Python or R code, and run your process. The nice thing about this is that you can go from RapidMiner to Python, then to R, and back again. Below is an example process that uses the Finance & Economics extension to download some S&P500 (^GSPC) data and then uses an R script to calculate the ARIMA point forecast. You can see how easily the R Scripting operator mixes with native RapidMiner Studio operators (Rename) and even other Studio extensions.

arima-forecast

 

Working with these operators is quite easy, but there are a few things you need to be aware of so you don’t pull out your hair. Usually it’s as easy as pasting your R script (or Python script for that matter) into the “{…}” of the function. These operators can also take multiple inputs and return multiple results: add more arguments to the function and return more values. If you want to pass more inputs to a script, all you need to do is something like this:

rm_main = function(model, data)
{
  # my awesome R script …

  # R's return() takes a single value, so wrap multiple results in a list
  return(list(dataframe1, dataframe2))
}

Tip: If you call any libraries in R (or import any modules in Python), make sure they are installed and that the R or Python installation RapidMiner uses can find them.
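The same pattern works on the Python side. Here is a minimal sketch, assuming the Python Scripting operator hands each connected input to rm_main as a pandas DataFrame and sends each returned object to its own output port. The column names (symbol, close, name) are made up for illustration and are not from the sample process.

```python
import pandas as pd


def rm_main(prices, lookup):
    """Entry point the scripting operator calls (assumed: one argument per input port)."""
    # Hypothetical column names, for illustration only.
    joined = prices.merge(lookup, on="symbol", how="left")
    # Average closing price per instrument name.
    summary = joined.groupby("name", as_index=False)["close"].mean()
    # Returning two objects (a tuple) delivers two results, in port order.
    return joined, summary


if __name__ == "__main__":
    # Small stand-in inputs so the sketch can be run outside RapidMiner.
    prices = pd.DataFrame({"symbol": ["^GSPC", "^GSPC"], "close": [100.0, 102.0]})
    lookup = pd.DataFrame({"symbol": ["^GSPC"], "name": ["S&P 500"]})
    joined, summary = rm_main(prices, lookup)
    print(joined)
    print(summary)
```

Remember to connect as many output ports as the function returns values, otherwise the extra results have nowhere to go.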

arima-r-code

Here are the results.

point-forecast

 

You can try out this sample process by downloading this: gspc-arima-example. The one thing that we don’t show you here is how you can embed an R script (or Python for that matter) into an actual RapidMiner operator, like a Cross Validation.

Embedding Python and R inside RapidMiner operators

You can easily embed these scripts inside any RapidMiner operator that has a subprocess. This knowledge base article explains how you can use the RapidMiner Cross Validation operator and embed an R script algorithm to train and then test the model. You can also extend RapidMiner macros INTO your scripts and change them on the fly! This is cool because you can easily use the Optimize Parameters operator to optimize the output of your script!

It’s really easy to optimize your scripts if you use Set Macro operators. The screenshots below show the setting of the p, d, and q values for the ARIMA R script.

op-create-macros

Next, call the macros in your R Script:

op-r-script
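If you are wondering what "calling a macro" in a script amounts to, here is a rough sketch of the idea in Python. It assumes the usual %{name} macro syntax and that the placeholder is swapped for the macro's value as plain text before the script runs. The expand_macros helper and the sample R line are hypothetical and only stand in for what Studio does internally; they are not RapidMiner code.

```python
import re

# Matches placeholders such as %{p}; real macro names may allow more characters.
MACRO_PATTERN = re.compile(r"%\{(\w+)\}")


def expand_macros(script_text, macros):
    """Replace each %{name} placeholder with its macro value as plain text."""

    def substitute(match):
        name = match.group(1)
        if name not in macros:
            raise KeyError(f"macro not defined: {name}")
        return str(macros[name])

    return MACRO_PATTERN.sub(substitute, script_text)


if __name__ == "__main__":
    # Hypothetical line from an R script that reads the three ARIMA macros.
    r_line = "fit <- arima(x, order = c(%{p}, %{d}, %{q}))"
    print(expand_macros(r_line, {"p": 1, "d": 1, "q": 0}))
    # prints: fit <- arima(x, order = c(1, 1, 0))
```

Because the substitution is textual, the script sees ordinary numbers, which is why an outer operator can change p, d, and q without touching the script itself.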

 

Set the range of the parameters for the macros you want to optimize.

op-macros

Then hit run! If you use a Log operator, you can export the AIC performance for each combination of p, d, and q values. It’s that simple, and you’ll love how easy it is to use.
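Conceptually, the optimization is a grid search: try every combination of p, d, and q, record the AIC for each, and keep the lowest (a lower AIC is better). The sketch below shows that loop in plain Python. The optimize_order function and the toy scorer are hypothetical stand-ins; in the real process the score comes from the ARIMA fit inside your R script.

```python
from itertools import product


def optimize_order(fit_aic, p_values, d_values, q_values):
    """Try every (p, d, q) combination and keep the one with the lowest AIC.

    fit_aic is a caller-supplied function (hypothetical here) that fits the
    model for one combination and returns its AIC.
    """
    log = []  # one row per combination, like the table a Log operator collects
    for p, d, q in product(p_values, d_values, q_values):
        log.append({"p": p, "d": d, "q": q, "aic": fit_aic(p, d, q)})
    best = min(log, key=lambda row: row["aic"])
    return best, log


if __name__ == "__main__":
    # Stand-in scorer so the sketch runs without a time series.
    # It is NOT a real AIC; it just has a known minimum at (1, 1, 0).
    def toy_aic(p, d, q):
        return (p - 1) ** 2 + (d - 1) ** 2 + q ** 2 + 100.0

    best, log = optimize_order(toy_aic, range(3), range(2), range(3))
    print(best)
    print(len(log), "combinations evaluated")
```

Keeping the full log, not just the winner, lets you see how sensitive the AIC is to each parameter.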

 
