I’m sure nobody likes the idea of meeting snakes on a plane, never mind snakes on a rocket ship. But there are a good number of folks who get excited when they see Python code in a data science project, which is why we’re giving Python a stratospheric boost in our latest RapidMiner release.
Because our mission at RapidMiner is to enable data-loving people of all skill levels to have a positive impact on their business, we wanted to grow into a data science platform that offers value to all of them, including coders.
In this blog post, I’ll walk you through our thinking on how we added Jupyter Notebook functionality to the RapidMiner platform in our 9.6 release. If you’d like to see everything that changed in RapidMiner 9.6, including our browser-based auto ML solution RapidMiner Go, you can read Putting people at the center of AI: RapidMiner 9.6.
An inclusive and integrated AI platform
We believe that data science is a team sport, and most successful data science projects are multi-disciplinary, multi-department endeavors involving people with differing backgrounds and skillsets.
We also know that RapidMiner Studio’s visual paradigm is not for everyone. A lot of people working on data science projects, such as data engineers or data scientists, have a strong background in writing code and are most effective when working in a code-based paradigm.
To support these users, we added functionality to the RapidMiner platform aimed at people who prefer a non-linear, experimental environment for delivering value.
We wanted this functionality to:
- be inclusive, available to all users of the platform
- be well-integrated with other parts of the platform
- foster collaboration with other project members (of different skillsets)
- feel familiar to users who have experience in writing code
- provide a fast zero-to-awesome experience, while staying flexible and customizable
- be easy to deploy and manage
Here’s what we ended up creating, which I believe fulfills all of the above needs.
Hubs, labs, and notebooks
With the 9.6 release of RapidMiner, we started shipping a bundled and configured JupyterHub instance with the RapidMiner platform deployment.
It is conveniently accessible from the RapidMiner Server user interface, and it boasts a Single Sign-On experience with Server: just use the same credentials you would use for RapidMiner Server to access it.
Once logged in, users are presented with a JupyterLab interface (those who prefer the plain old Jupyter Notebooks can still access that interface) and can instantly start working on their code. We ship a pre-configured Python kernel which includes popular data science-related Python packages, and we also provide a way for users to create their own custom kernels for projects where something cutting-edge or custom is needed.
As for collaboration, with our (pre-installed) python-rapidminer library, the RapidMiner Server repository is easily accessible. Users can read and write data stored in the Server repository, execute RapidMiner processes and use their outputs in their code, and even store models and other custom Python objects.
A convenient notebook template guides users through the scenarios above so they can get started quickly and begin collaborating with others on their team.
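To make the read, run, and write workflow above a bit more concrete, here’s a minimal sketch of what such a notebook cell might look like. It does not call the real python-rapidminer library: `InMemoryRepository`, its `register_process` helper, and all repository paths are hypothetical stand-ins so the example runs without a Server. The method names `read_resource`, `write_resource`, and `run_process` are my assumption of how a connector could be shaped, so please check the library’s own documentation for the actual API.

```python
import pickle


class InMemoryRepository:
    """Hypothetical stand-in for a Server repository connection.

    This is NOT the python-rapidminer API; it only mimics the workflow
    (read data, run a process, write results, store Python objects).
    """

    def __init__(self):
        self._entries = {}    # repository path -> pickled object
        self._processes = {}  # repository path -> callable

    def write_resource(self, resource, path):
        # Serialize any picklable object (data, model, ...) under a path.
        self._entries[path] = pickle.dumps(resource)

    def read_resource(self, path):
        return pickle.loads(self._entries[path])

    def register_process(self, path, func):
        # Hypothetical helper: a "process" here is just a Python callable.
        self._processes[path] = func

    def run_process(self, path, inputs):
        return self._processes[path](inputs)


def score_and_store(connector, data_path, process_path, result_path):
    """Read data, run a process on it, and write the output back."""
    rows = connector.read_resource(data_path)
    scored = connector.run_process(process_path, inputs=rows)
    connector.write_resource(scored, result_path)
    return scored


repo = InMemoryRepository()

# Data a teammate might have stored in the repository.
repo.write_resource(
    [{"customer": "a", "spend": 120}, {"customer": "b", "spend": 40}],
    "/home/demo/data/customers",
)

# A process a teammate might have built visually; here it is a lambda.
repo.register_process(
    "/home/demo/processes/flag_big_spenders",
    lambda rows: [dict(r, big_spender=r["spend"] > 100) for r in rows],
)

result = score_and_store(
    repo,
    "/home/demo/data/customers",
    "/home/demo/processes/flag_big_spenders",
    "/home/demo/results/flagged",
)

# Custom Python objects (a toy "model" here) can be stored the same way.
repo.write_resource({"threshold": 100}, "/home/demo/models/threshold")

print(result)
```

The point of the sketch is the shape of the collaboration: the coder never needs to know how the process was built, only where it lives in the repository and what it returns.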
I have three topics in mind (in no particular order) that the product team is going to focus on to improve this experience in the future:
- Tighter integration with the RapidMiner repositories
- Adding other notebook kernels (such as R)
- Hassle-free, automated deployment of code created in Jupyter
Stay tuned for more in future RapidMiner releases.
Giving it a go
I hope the short summary above is enough to get you excited to give it a go. We are always eager to learn, and there’s no better way than listening to our users and customers about what to improve. Feel free to reach out to me on LinkedIn and share your thoughts and ideas for making the product even better.
JupyterHub is available as part of our containerized RapidMiner platform deployment. You can either upgrade an existing deployment and add this new component to it or create a brand new one.