Tips & Tricks: Using RapidMiner Server (Now RapidMiner AI Hub)

Are you a RapidMiner Studio user? Do you have models you want to operationalize? Do you need to score some incoming data and publish it out to your BI team? Does your boss want a stream of fraud predictions?

Well, there’s a solution to these problems: use RapidMiner Server (now RapidMiner AI Hub)!

Before we get started, take a few minutes to check out this video on some important functions of RapidMiner AI Hub.


Let’s take a look at how RapidMiner connects to your data. Connections that are configured on RapidMiner AI Hub are visible to users from RapidMiner Studio.

A few of the available connections are: the Twitter API, SQL databases, commercial databases, Salesforce, Dropbox, Hadoop, Amazon cloud, and many more. All of these connections can be established locally, but how does that help with working as a team? It doesn’t.

Luckily, with RapidMiner AI Hub we can configure these connections once for everyone’s use. From the AI Hub browser interface we can quickly add any connections we want to make available to our users. This greatly expands data access for the many teams that may be working together on the AI Hub. Once the connections are established, they show up in RapidMiner Studio for each individual user to take advantage of.

[Screenshot: Manage Connections]

[Screenshot: Manage Database Connections]


RapidMiner AI Hub admins control the security rights for all users. From the AI Hub browser interface, admins can grant view, change, and execute rights to users. You can quickly look through what is stored on the AI Hub by opening the repository and moving through each of its folders. As an admin, I can also organize our users into groups.

We mentioned above the case where the data science team needs to push results out to a business analyst team. Let’s take a look at this case from the RapidMiner AI Hub perspective. Here you can make the workflows, models, and data viewable and executable by the BI team, while the data science team keeps the editing rights.

This integrates your data scientists’ work into the daily workflow of your business analysts through a single platform. Along with this, you can track the processes that have been run on the AI Hub through the browser, which lets admins monitor usage and moderate the process schedules.

[Screenshot: Server Repository]

AI Hub Repository folder. Notice the folders by username. Each user starts with a folder on creation of their account. Any new folders added have their own custom authorization rights.

[Screenshot: Edit Access Rights]

Notice the different authorization options available to this user.
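To make the split between the two teams concrete, here is a minimal conceptual sketch in Python. It is not the AI Hub API: the Folder class, the folder path, and the group names are all hypothetical, and the only thing taken from the article is the set of three rights (view, change, execute).

```python
from dataclasses import dataclass, field

# The three rights named in the article; the identifiers are illustrative.
VIEW, CHANGE, EXECUTE = "view", "change", "execute"


@dataclass
class Folder:
    """A repository folder with per-group rights (hypothetical model)."""
    path: str
    rights: dict = field(default_factory=dict)  # group name -> set of rights

    def grant(self, group, *granted):
        """Add one or more rights for a group on this folder."""
        self.rights.setdefault(group, set()).update(granted)

    def allows(self, group, right):
        """Return True if the group holds the given right on this folder."""
        return right in self.rights.get(group, set())


# Hypothetical folder and group names.
fraud = Folder("/shared/fraud-detection")
fraud.grant("data-science", VIEW, CHANGE, EXECUTE)
fraud.grant("bi-team", VIEW, EXECUTE)

print(fraud.allows("bi-team", EXECUTE))  # can the BI team run the process?
print(fraud.allows("bi-team", CHANGE))   # can the BI team edit it?
```

In this sketch the BI team can open and run what lives in the folder, while only the data science group can change it, which mirrors the arrangement described above.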

[Screenshot: RapidMiner Server Process Scheduler]

Process Scheduler on RapidMiner AI Hub. From this view in the browser, you can check who is scheduling jobs on the AI Hub and navigate to those processes to see what they are scheduling.


Now that we’ve shown the group work side of AI Hub, let’s actually do some data science. A brief background on the data and the use case is as follows. We are in charge of detecting trade fraud. The only avenue for doing so is by blasting our way through company emails. The details of how we do this are in the screenshot. There are a few operators there, but basically we leverage some text processing and an SVM to predict positive or negative.

[Screenshot: Process Model Generation]
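The process itself is built from RapidMiner operators, but the idea behind it can be sketched in a few lines of Python: turn an email into term counts, then apply a linear decision function of the kind a linear SVM produces. The weights and bias below are made up for illustration, not learned from data, and the function names are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def decision_value(text, weights, bias):
    """Linear decision function w.x + b over the email's term counts."""
    counts = Counter(tokenize(text))
    return sum(weights.get(term, 0.0) * n for term, n in counts.items()) + bias


# Hypothetical weights; a real model learns these from labelled emails.
WEIGHTS = {"secret": 1.5, "tell": 0.4, "anyone": 0.6, "meeting": -0.8}
BIAS = -1.0

email = "I have some secret information for you, don't tell anyone, Ingo."
score = decision_value(email, WEIGHTS, BIAS)
label = "positive" if score > 0 else "negative"
print(label)
```

A positive decision value maps to the positive (fraud) class and anything else to negative. The real process would handle the tokenizing, weighting, and training inside RapidMiner.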

We’ve already done the heavy lifting, and we are confident that our SVM can detect trade fraud emails. Now let us simulate a real-time input of fresh emails. To do that, we have generated a dummy email that reads as follows: “I have some secret information for you, don’t tell anyone, Ingo.”

We could create this one email, push it through the model, and deploy that on the AI Hub as shown in this screenshot. However, since our employees send hundreds of thousands of emails every day, let us instead set this up to push all new emails through our fraud detector in real time.

[Screenshot: Web Service]

WEB SERVICE. If we take our lazy deployment process and add a few tweaks (mostly just adding in Ingo’s email), we can build it out as a web service and establish a feed that constantly sends new emails through our fraud detector from the data source of our choosing. The power is at your fingertips.
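As a rough illustration of what such a feed amounts to, the Python sketch below pushes every incoming email through a scoring function and keeps the ones flagged positive. The keyword_stub scorer is only a placeholder standing in for the deployed model, and the second email in the list is invented for the example.

```python
def score_feed(emails, score_email):
    """Yield (email, label) for every email arriving on the feed."""
    for email in emails:
        yield email, score_email(email)


def keyword_stub(email):
    """Placeholder scorer (hypothetical); a deployed model would go here."""
    return "positive" if "secret" in email.lower() else "negative"


incoming = [
    "I have some secret information for you, don't tell anyone, Ingo.",
    "Reminder: the quarterly report is due on Friday.",
]

# Keep only the emails the scorer labels as positive.
flagged = [email for email, label in score_feed(incoming, keyword_stub)
           if label == "positive"]
print(len(flagged))
```

Because score_feed is a generator, it works the same way whether the emails come from a fixed list, as here, or from a source that keeps producing new messages.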

Let us navigate to the web service view.

[Screenshot: RapidMiner Server Web Service View]

[Screenshot: RapidMiner Server Build Web Service]

Here we just select the process we want to publish as a web service and check that its options, if there are any, are set correctly. Now if we go back and click the cog icon, we will see the web service in action. We now have a published web service for our fraud detector.

[Screenshot: RapidMiner Server Web Service Output]
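Once the service is published, other systems can call it over HTTP. The Python sketch below only builds the request and parses a sample response, so it runs without a server. Everything specific in it is an assumption rather than something taken from the article: the server address, the /api/rest/process/ path, the service name fraud-detector, the text query parameter, and the shape of the JSON response. Check the web service view on your own AI Hub for the actual URL and parameters.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_scoring_request(base_url, service_name, email_text):
    """Build (but do not send) a GET request for a published web service."""
    query = urlencode({"text": email_text})  # "text" is a hypothetical parameter
    # The path below is an assumption; copy the real one from your server.
    url = f"{base_url.rstrip('/')}/api/rest/process/{quote(service_name)}?{query}"
    return Request(url, headers={"Accept": "application/json"})


def parse_prediction(body):
    """Pull the predicted label out of a JSON response body (assumed shape)."""
    rows = json.loads(body)
    return rows[0]["prediction"]


request = build_scoring_request(
    "http://localhost:8080",  # hypothetical server address
    "fraud-detector",         # hypothetical service name
    "I have some secret information for you, don't tell anyone, Ingo.",
)
print(request.full_url)

# A made-up response body, used only to exercise the parser.
sample_body = '[{"prediction": "positive", "confidence": 0.93}]'
print(parse_prediction(sample_body))
```

To actually send the request, pass it to urllib.request.urlopen and add whatever authentication your server requires.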