Snakes in space: Unleashing the power of Jupyter Notebook and Python in RapidMiner

I’m sure nobody likes the idea of meeting snakes on a plane, never mind snakes on a rocket ship*. But there are a good number of folks who get excited when they see Python code in a data science project, which is why we’re giving Python a stratospheric boost in our latest RapidMiner release.

Because our mission at RapidMiner is to enable data-loving people of all skill levels to have a positive impact on their business, we wanted to grow into a data science platform that offers value to all of them, including coders.

In this blog post, I’ll walk you through our thinking on how we added Jupyter Notebook functionality to the RapidMiner platform in our 9.6 release. If you’d like to see everything that changed in RapidMiner 9.6, including our browser-based auto ML solution RapidMiner Go, you can read Putting people at the center of AI: RapidMiner 9.6.

An inclusive and integrated AI platform

We believe that data science is a team sport, and most successful data science projects are multi-disciplinary, multi-department endeavors involving people with differing backgrounds and skillsets.

We also know that RapidMiner Studio’s visual paradigm is not for everyone. A lot of people working on data science projects, such as data engineers or data scientists, have a strong background in writing code and are most effective when working in a code-based paradigm.

To support these users, we added functionality to the RapidMiner platform for people who prefer a non-linear, experimental, notebook-style way of working to deliver value.

We wanted this new functionality to:

  • be inclusive, available to all users of the platform
  • be well-integrated with other parts of the platform
  • foster collaboration with other project members (of different skillsets)
  • feel familiar to users who have experience in writing code
  • provide a fast zero-to-awesome experience, while staying flexible and customizable
  • be easy to deploy and manage

Here’s what we ended up creating, which I believe meets all of the needs above.

Hubs, labs, and notebooks

With the 9.6 release of RapidMiner, we started shipping a bundled and configured JupyterHub instance with the RapidMiner platform deployment.

It is conveniently accessible from the RapidMiner AI Hub (formerly RapidMiner Server) user interface, and it boasts a Single Sign-On experience with AI Hub: just use the same credentials you would use for RapidMiner AI Hub to access it.

Once logged in, users are presented with a JupyterLab interface (those who prefer the plain old Jupyter Notebook can still access that interface) and can instantly start working on their code. We ship a pre-configured Python kernel which includes popular data science-related Python packages, and we also provide a way for users to create their own custom kernels for projects where something cutting-edge or custom is needed.
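To make the idea of a custom kernel more concrete: in plain Jupyter, a kernel is described by a small kernel.json file inside a named kernel directory. The sketch below writes such a file into a temporary folder. It shows the generic Jupyter kernelspec format only, not the RapidMiner-specific way of creating kernels, which this post does not describe. The helper write_kernel_spec, the kernel name "my-project", and the display name are assumptions made up for illustration.

```python
import json
import sys
import tempfile
from pathlib import Path


def write_kernel_spec(kernels_dir: Path, name: str, display_name: str,
                      python_executable: str) -> Path:
    """Write a minimal Jupyter kernel.json (hypothetical helper, for illustration)."""
    kernel_dir = kernels_dir / name
    kernel_dir.mkdir(parents=True, exist_ok=True)
    spec = {
        # Command Jupyter runs to start the kernel; Jupyter substitutes
        # {connection_file} with the path to the connection file at launch.
        "argv": [python_executable, "-m", "ipykernel_launcher",
                 "-f", "{connection_file}"],
        # Name shown in the JupyterLab launcher (assumed value).
        "display_name": display_name,
        "language": "python",
    }
    spec_path = kernel_dir / "kernel.json"
    spec_path.write_text(json.dumps(spec, indent=2))
    return spec_path


# Demo: write into a throwaway directory instead of a real Jupyter data path.
demo_dir = Path(tempfile.mkdtemp())
spec_path = write_kernel_spec(demo_dir, "my-project", "Python (my-project)",
                              sys.executable)
loaded = json.loads(spec_path.read_text())
print(loaded["display_name"])
```

For a kernel like this to actually start, the Python environment it points to needs the ipykernel package installed, and the kernel directory has to live in one of the locations Jupyter searches for kernels.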

As for collaboration, with our (pre-installed) python-rapidminer library, the RapidMiner AI Hub repository is easily accessible. Users can read and write data stored in the AI Hub repository, execute RapidMiner processes and use their outputs in their code, and even store models and other custom Python objects.
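Here is a rough sketch of what that read, run, and store workflow looks like from a notebook. Important caveat: RepositoryStub below is a hypothetical in-memory stand-in written for this example, so that the snippet runs anywhere. It is not the python-rapidminer API, and its method names, the repository paths, and the sample data are all assumptions. A "process" here is just a Python function, whereas a real RapidMiner process runs on the platform.

```python
import pickle
from typing import Any, Callable, Dict


class RepositoryStub:
    """Hypothetical in-memory stand-in for a shared repository.

    NOT the python-rapidminer API; it only mimics the workflow.
    """

    def __init__(self) -> None:
        self._entries: Dict[str, bytes] = {}
        self._processes: Dict[str, Callable[[Any], Any]] = {}

    def write(self, path: str, obj: Any) -> None:
        # Serialize so that data, models, and other Python objects can be stored.
        self._entries[path] = pickle.dumps(obj)

    def read(self, path: str) -> Any:
        return pickle.loads(self._entries[path])

    def register_process(self, path: str, func: Callable[[Any], Any]) -> None:
        # Stands in for a process that a teammate built and saved.
        self._processes[path] = func

    def run_process(self, path: str, inputs: Any) -> Any:
        return self._processes[path](inputs)


repo = RepositoryStub()

# A teammate has stored some data and a process (made-up sample values).
repo.write("/data/sales", [
    {"region": "north", "revenue": 120.0},
    {"region": "south", "revenue": 80.0},
])
repo.register_process("/processes/total_revenue",
                      lambda rows: sum(row["revenue"] for row in rows))

# 1. Read data from the repository.
rows = repo.read("/data/sales")

# 2. Execute a process and use its output in code.
total = repo.run_process("/processes/total_revenue", rows)

# 3. Store a model (here, a trivial mean baseline) as a Python object.
model = {"kind": "mean_baseline", "value": total / len(rows)}
repo.write("/models/baseline", model)

restored = repo.read("/models/baseline")
print(restored)
```

The point of the sketch is the shape of the collaboration: code in a notebook and work stored in the shared repository can feed into each other in both directions.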

A convenient notebook template guides users through the scenarios above, so they can get started quickly and begin collaborating with others on their team.

What’s next?

I have three topics in mind (in no particular order) that the product team is going to focus on to improve this experience in the future:

  • Tighter integration with the RapidMiner repositories
  • Adding other notebook kernels (such as R)
  • Hassle-free, automated deployment of code created in Jupyter

Stay tuned for more in future RapidMiner releases.

Giving it a go

Hopefully the short summary above is enough to get you excited to give it a go. We are always eager to learn, and there’s no better way than listening to our users and customers about what to improve. Feel free to reach out to me on LinkedIn and share your thoughts and ideas to make the product even better.

JupyterHub is available as part of our containerized RapidMiner platform deployment. You can either upgrade an existing deployment and add this new component to it or create a brand new one.

Check out our platform deployment documentation, where you can find handy deployment templates to get started quickly. Or just fire up an instance in AWS or Azure and start experimenting.

*No space snakes were harmed in the making of this release.

Tamás Kenéz


Tamás is a product manager who's a nerd at heart. He has extensive product experience from Ericsson, LogMeIn, and Cloudera. Outside work, when he's not spending time with his family, he's busy applying the scientific method in the kitchen and whisky bars.