Get details on the latest RapidMiner platform enhancements
RapidMiner 9.10.3 advances your team’s ability to develop cutting-edge solutions for anomaly detection, image classification, computer vision & more.
RapidMiner’s latest release advances your team’s ability to develop cutting-edge solutions for anomaly detection, image classification, computer vision and much more. We’re releasing a brand-new streaming extension and have enhanced our deep learning extension to offer productivity benefits for coding data scientists while also making advanced techniques more accessible to non-coders. We’re excited to see how your teams leverage both updates to create impactful solutions.
Below is some additional context on each extension.
What’s New in RapidMiner 9.10.3?
- New streaming extension: RapidMiner’s new streaming extension combines the ease-of-use of modern data science platforms with the powerful analytical capabilities of streaming clusters. By using it, you can take any trained model that you’ve created in RapidMiner and push it into the hugely popular streaming components of Apache Flink & Spark—all without writing a line of code. Additionally, our Kafka Connector Extension now helps to detect available topics on clusters and provides more options for secure connections.
- Deep learning extension update: RapidMiner’s deep learning extension now has a built-in autoencoder, which allows users to reduce the dimensionality of their data and extract useful features faster than ever before. This opens up new possibilities for what you can use the extension for, including unsupervised learning problems and anomaly detection.
9.10 took important steps towards responsible AI while helping analytics teams accelerate time-to-value for streaming & IIOT use cases.
What’s New in RapidMiner 9.10?
- Bias detection & mitigation: Receive bias warnings in every part of the RapidMiner platform, including Turbo Prep, Model Simulator and more. When Studio detects a column that could lead to model bias, you’ll receive a warning along with an in-platform callout that explains what triggered it.
- Streaming & IIOT advancements: Mix and match RapidMiner with Python in low-latency (50-100 ms) use cases, such as scoring large volumes of sensor data. Additionally, leverage a new function-fitting operator to fit data with custom functions when creating models for anomaly detection on devices, modeling physical behavior based on data, and more.
- Security enhancements: Support for Docker Rootless mode along with enhanced security in Kubernetes environments both raise our overall security standards. Security for containerized platforms is also improved through regular updates of Docker images with the newest secure components.
- Time series forecasting: Automate forecasting future values of univariate time series based on historical data in RapidMiner Go. Track advanced and seasonal trends when forecasting sales or staffing requirements and use intuitive visualizations to compare the results of competing models.
- NLP extension: Leverage a new RapidMiner extension for natural language processing to extract part-of-speech tags and recognize people, cities, organizations, and other entities within free text. This is typically used as a pre-processing method to determine the contents of documents, website text, etc.
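To illustrate the function-fitting idea mentioned in the streaming & IIOT item above, here is a minimal sketch in plain Python. It is not the RapidMiner operator: the function names, the sample readings and the residual threshold are all assumptions made for this example. It fits a straight line by ordinary least squares and flags new readings that fall far from the fitted function.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def flag_anomalies(xs, ys, slope, intercept, threshold):
    """Return the indices whose absolute residual exceeds the threshold."""
    return [i for i, (x, y) in enumerate(zip(xs, ys))
            if abs(y - (slope * x + intercept)) > threshold]


# Hypothetical sensor data: fit on readings known to be normal...
train_x = [0, 1, 2, 3, 4]
train_y = [1.0, 3.0, 5.0, 7.0, 9.0]   # lies exactly on y = 2x + 1
slope, intercept = fit_line(train_x, train_y)

# ...then check new readings against the fitted function.
new_x = [5, 6, 7]
new_y = [11.0, 20.0, 15.0]            # the reading at x = 6 is off the line
anomalies = flag_anomalies(new_x, new_y, slope, intercept, threshold=1.0)
print(anomalies)
```

A custom function with more parameters would follow the same pattern: estimate the parameters on normal data, then monitor the residuals.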
9.9 allows you to simplify complex challenges and get to value faster.
What’s New in RapidMiner 9.9?
With new deep learning capabilities, enhanced Python integration, and significantly faster process execution, RapidMiner 9.9 enables you to tackle complex use cases and deliver business value faster than ever before.
- Further embrace Python: Code machine learning models and data transformation steps in Python, then easily share them as operators with the non-coders on your team so they can benefit from your work. Additionally, reuse centrally governed data connections by accessing connection properties through a new API in our Python library.
- Deliver value faster: Use visual transfer learning to accelerate time-to-value for advanced use cases. Reuse past models to jumpstart new projects, or start with well-known deep learning models for image classification, facial recognition, etc. Additionally, run processes up to 60x faster with our new data processing engine that is optimized for fast in-memory computation of analytics workloads.
- Overcome barriers at the edge: Easily collaborate on edge-based use cases, jointly develop and deploy projects and integrate Python code or models as needed. Use the new ‘continuous mode’ of real-time scoring nodes to integrate model inference into streaming or edge computing use cases and connect them to event streaming platforms like Apache Kafka.
RapidMiner Go makes it even easier to explore insights with your team and communicate the results of your models.
The new shareable model simulator feature in RapidMiner Go creates a link to an interactive interface with adjustable inputs and real-time view of predicted outcomes. It displays predictions, confidences and explanations for those inputs.
- Instantly share an interactive simulator for your model with stakeholders
- Allow stakeholders to experiment with your model and discover how it works
- See how models could perform in the real world
- Generate cross-organizational interest in your projects
- Foster meaningful conversations about putting your models into production
We’ve also improved our brand new auto-clustering capability by making it easier to share and manage your analyses and by improving the overall user experience.
9.8 continued to innovate in data science collaboration, connectivity & governance.
With our latest 9.8 release, RapidMiner now offers:
- Advanced AI governance and collaboration in the AI Hub: Optimize performance when working with large (or very large!) files in projects and effortlessly exchange content between projects, all in an effort to make it easier to promote development versions to production environments. Additionally, queue owners can now put one user in charge of the execution environment to better fit the needs of their organization.
- Enhanced connectivity in enterprise environments: Simplified proxy setup with automatic configuration of settings (based on OS settings or PAC files), an improved Radoop proxy to streamline connectivity to Hadoop clusters (including support for KMS or Timeline Service traffic), support for connections to utilize external data in real-time use cases on the edge (using Real-Time Scoring nodes) and for MS-SQL in-database processing.
- Continued investments into data science innovations and governance: Fully centralized Python and R coding environment management with the AI Hub (now including kernels inside Notebooks), fast and efficient K-Means clustering based on H2O’s implementation, several image mining and deep learning improvements such as a UI to monitor model training, ready-made Docker environments, and new operators and advanced (time-based) windowing functionality for time series operators.
- Branding customization of RapidMiner Go for OEM customers: Create a familiar and seamless branded experience for users with customizable UI elements. Customize the interface with your company colors, as well as a custom application name and URL.
9.7 continued to put people at the center of the AI journey by fostering better collaboration without sacrificing governance.
RapidMiner Server is now the RapidMiner AI Hub
- New project-based repository offering unprecedented collaboration and governance for AI: Diverse teams can work together on use cases in a central location across automated, visual and code-based authoring styles. Enterprises can easily and iteratively convert ideas into business impact.
- Projects include fine-grained version control based on git standards: Version-controlled projects enable iterative and collaborative, yet governed, development of AI, delivering a rare combination of agility and traceability in your AI development lifecycles.
- Enterprise-grade identity & access management: To support collaboration at large scale, we’ve introduced a new identity and access management framework, including Single Sign-On to create a consistent experience across the platform (Go, Studio and AI Hub, as well as JupyterHub, our platform admin tools, and our interactive dashboarding capabilities).
RapidMiner AI Hub
Central collaboration and governance
The new projects with universal data storage (i.e., storage of any file type) drive seamless collaboration across different AI authoring styles (automated, visual and code-based). Git-based version control tracks all changes as “snapshots”, allowing users to easily “roll back”, which enables smoother collaboration and conflict resolution.
Identity and access management
RapidMiner AI Hub now ships with a new identity provider component based on the open source component Keycloak. It provides precise access controls by user, group and role, and a seamless single sign-on experience across the platform. It also integrates with all common identity providers, user databases and more.
RapidMiner Notebooks, the JupyterLab interface in the RapidMiner platform, is now fully integrated with the new projects framework, allowing for seamless and easy collaboration of coders with other users as well as full traceability of the code-based work within the AI lifecycle.
The RapidMiner AI Hub now features a new dashboard providing insight into what’s going on in your system, including: executed and failed jobs, disk usage of projects, configured schedules, web services, and more.
RapidMiner Studio has been enhanced to support the use of the projects framework that’s new to the RapidMiner platform.
New file format for high performance
Studio now leverages HDF5 for data storage, which enhances stability and performance with large amounts of data. It also enables easier and faster data exchange with Python.
Improvements to augmented machine learning
Auto Model reduces memory usage and run times and allows multiple Auto Model jobs to be submitted to the AI Hub at once.
Model Ops offers flexible storage options for deployed models, and unused columns and ID columns are now kept in the results after scoring for enhanced audits.
Updated H2O library
Enhances performance, exposes weights for deep learning and more.
We’ve added a number of new enhancements to Radoop that improve convenience and ease of use.
9.6 aimed to put people at the center of the AI journey by making ML accessible to anyone and driving collaboration between people of different backgrounds & preferences.
RapidMiner Go is a brand new, fully automated and guided offering, built for users with minimal data science experience. All you need is a data set and a few minutes. RapidMiner Go is accessible through a browser, so there’s no need to download anything or use local machine resources. It is tightly integrated with the rest of the platform so business users can prototype models and collaborate with more experienced users to get the models into production.
RapidMiner Studio 9.6
Model Ops works with any models
Model Ops is now the go-to way for your entire analytics team to universally manage and operate all of your models across your enterprise, no matter how they were created. We support custom-created models, code-based models and even complex models with multiple pre-processing steps.
Time series and time zone enhancements
We’ve solved some common challenges with using time-based data, including brand new operators that improve time series-specific data transformation and feature extraction.
RapidMiner Server 9.6
Code in the RapidMiner platform with JupyterHub
We’ve made it easier for full-time coders to work more collaboratively with non-coders by co-deploying JupyterHub with RapidMiner Server. This includes SSO and easy connection to the repository.
Easily create interactive dashboards and visualizations with your results
We’ve integrated popular open-source dashboarding technology that’s easy to use and offers a wide variety of visualizations.
Stop all the jobs
A new “clean-up” button can reset the Server queue and stop all running and pending jobs, leaving the Server in a clean state, ready to start anew.
RapidMiner Radoop 9.6
Enterprise-ready experience enhancements
We’ve added a number of new enhancements to Radoop that improve the experience, especially in larger enterprise environments.
Time series and time zone enhancements
We’ve solved some common challenges with using time-based data, including some brand new operators that improve time series-specific data transformation and feature extraction.
9.5 was all about optimizing your administrative experience. Whether you’re upgrading your deployment, managing a Docker-based deployment, or connecting to the latest version of Hadoop, you'll appreciate these new enhancements.
RapidMiner Studio 9.5
Upgrade RapidMiner Studio independently from Server
Connect to and access data and processes on older Server versions (9.0 or above) with any current or future Studio version. The latest Studio release will verify executability of processes stored on Server.
RapidMiner Server 9.5
Instant execution in Server
We have drastically improved latency in Server executions. Run short jobs 10x faster than before!
Easily manage Docker-based RapidMiner installations
The new RapidMiner Docker Deployment Manager provides an easy-to-use web interface to scale the number of Job Containers (JCs) in your environment and to re-configure or restart any component without specialized experience.
Full control of SLAs on queues – limit the lifetime of jobs
Control your job queues and make sure your Server time is spent in the best way. Now you can define a maximum execution time for jobs, so that, if jobs fail to finish on time, they won’t saturate the queues and affect the SLAs defined for others. It’s now easier to create reliable execution windows in shared environments.
Server settings via API
Fully automate Server deployments by programmatically editing the settings to your chosen values.
RapidMiner Radoop 9.5
Radoop Proxy for Hadoop 3
We have enhanced Radoop Proxy to work seamlessly with clusters based on Hadoop 3 (such as Cloudera CDH 6.x or HDP 3.x). This means that working with the latest version of Hadoop is easier and more secure.
Revamped general and connection-level settings
To make Radoop more user-friendly, we moved most of the settings from the RapidMiner Studio Preferences to Radoop connections.
9.4 was all about deployment. It’s easy for anyone to deploy and manage models, and we’ve streamlined platform deployment with the RapidMiner AI Cloud.
RapidMiner Studio 9.4
Automated Model Ops
An easy way for Auto Model users of any skill level to deploy models into production. You can analyze performance of multiple models per use case, monitor governance issues and even swap in better performing models. When combined with Turbo Prep and Auto Model, this creates a complete path to fully automated data science.
Enhancements to Auto Model, including Profit-Sensitive Scoring
We’ve added a unique capability which allows business users to input cost and revenue variables so models will self-optimize for profitability. We’ve also created a new, clean UI so it’s easy to understand how models were built.
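The idea of optimizing for profit rather than accuracy can be sketched as follows. This is only an illustration in plain Python, not how Auto Model implements it: the function names, the sample confidences and the revenue and cost figures are assumptions. The sketch picks the confidence threshold that maximizes total profit, given a revenue for each correctly targeted case and a cost for each false alarm.

```python
def profit_at_threshold(confidences, labels, threshold,
                        revenue_per_hit, cost_per_false_alarm):
    """Profit when every case with confidence >= threshold is acted on."""
    profit = 0.0
    for confidence, is_positive in zip(confidences, labels):
        if confidence >= threshold:
            profit += revenue_per_hit if is_positive else -cost_per_false_alarm
    return profit


def best_threshold(confidences, labels, revenue_per_hit, cost_per_false_alarm):
    """Try each observed confidence as a threshold and keep the most profitable."""
    candidates = sorted(set(confidences))
    return max(candidates,
               key=lambda t: profit_at_threshold(
                   confidences, labels, t,
                   revenue_per_hit, cost_per_false_alarm))


# Hypothetical model output: confidence for the positive class and true label.
confidences = [0.9, 0.8, 0.6, 0.4, 0.2]
labels = [True, True, False, True, False]

cheap_mistakes = best_threshold(confidences, labels, 100, 30)
costly_mistakes = best_threshold(confidences, labels, 100, 200)
print(cheap_mistakes, costly_mistakes)
```

With cheap false alarms the most profitable threshold is low, so more cases are acted on; when false alarms become expensive, the threshold moves up.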
New Data Prep and Modeling Operators
Easily clean your data for modeling and scoring with new operators such as: Replace All Missings, Handle Unknown Values, One Hot Encoding and Append (Robust). You can also rescale confidences – even for classification with more than two classes.
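For readers unfamiliar with these preparation steps, the sketch below shows what replacing missing values and one-hot encoding do to a nominal column. It is plain Python written for this example, not the RapidMiner operators themselves; the function names and the sample column are assumptions.

```python
def replace_missing(values, fill):
    """Replace every missing entry (None) with a fixed fill value."""
    return [fill if value is None else value for value in values]


def one_hot_encode(values):
    """Turn a nominal column into one 0/1 column per category."""
    categories = sorted(set(values))
    rows = [[1 if value == category else 0 for category in categories]
            for value in values]
    return categories, rows


# Hypothetical nominal column with one missing entry.
colors = ["red", None, "blue", "red"]
cleaned = replace_missing(colors, "unknown")
categories, encoded = one_hot_encode(cleaned)

print(categories)
for row in encoded:
    print(row)
```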
Enhanced Visualizations & New Charts
The new charts and visualizations help you tell a compelling and intuitive story about your data and your models.
Time Series Enhancements
We’ve added a number of new features that make it easier to create and validate accurate time series forecasts.
RapidMiner Server 9.4
Managed Offerings in the RapidMiner AI Cloud
Users can deploy models into production without heavy lifting from IT. Forget about managing RapidMiner on your own; our highly trained DevOps team can do it for you in our cloud environment.
Real-Time Scoring Now Available As Cloud VMs
Provides easy, fast, scalable deployment in the cloud. Find the new VMs in the AWS and MS Azure marketplaces!
Auto Model Web
A new browser-based version of the proprietary RapidMiner Auto Model technology, built for business users who know their data and use case, but don’t have an advanced data science background (requires RapidMiner Server).
Improved Job Management
Makes it easier to remove old jobs and prevent overuse of server resources by other users.
RapidMiner Radoop 9.4
Flexible Hadoop Governance
Use any SAML-based enterprise SSO to access Hadoop via the Radoop Proxy. Leverage the new connection framework for connecting to Hadoop via the Radoop Proxy.
9.3 made it easier for beginners and experts to work together, sped up model training, improved time series analysis, and more.
RapidMiner Studio 9.3
RapidMiner Python: drive collaboration between beginners and experts
Why choose between Python and RapidMiner? Augment your RapidMiner toolset with anything you can do in Python. Enhanced integration of RapidMiner and Python means you can select the best approach for the task at hand and enable collaboration of RapidMiner users working code-free and data scientists coding in Python.
Scale Auto Model in RapidMiner Studio with RapidMiner Server
We’ve given RapidMiner Studio the ability to scale automated ML to large data sets, leveraging the compute power of RapidMiner Server. Auto Model training will be much faster because you can run multiple algorithms in parallel on the distributed architecture of the RapidMiner Server platform and free up Studio to work on other tasks while the Server handles calculations.
Improved time series analysis and forecasting
RapidMiner now offers more operators to make it easier to perform time series analysis and forecasting. You can calculate a baseline forecasting performance as a benchmark for future analyses, understand the autocorrelation of time series, discover hidden patterns, and even improve forecasting performance by isolating the trend and seasonal components of time series data.
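As a rough illustration of two of these concepts, the sketch below computes a baseline forecasting error and the autocorrelation of a series in plain Python. It is not the RapidMiner operators; the function names and the sample series are assumptions. The baseline used here is the common "last value" forecast, where each point is predicted by its predecessor.

```python
def naive_forecast_mae(series):
    """Mean absolute error of predicting each point with the previous one."""
    errors = [abs(series[i] - series[i - 1]) for i in range(1, len(series))]
    return sum(errors) / len(errors)


def autocorrelation(series, lag):
    """Sample autocorrelation of the series with itself shifted by `lag`."""
    n = len(series)
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series)
    covariance = sum((series[i] - mean) * (series[i + lag] - mean)
                     for i in range(n - lag))
    return covariance / variance


# Hypothetical series that alternates with a period of two.
series = [1.0, 2.0, 1.0, 2.0, 1.0, 2.0, 1.0, 2.0]

baseline_mae = naive_forecast_mae(series)
lag1 = autocorrelation(series, 1)
lag2 = autocorrelation(series, 2)
print(baseline_mae, lag1, lag2)
```

A forecasting model is only worth keeping if its error beats the baseline, and a strong positive autocorrelation at some lag (here lag 2) hints at a repeating pattern of that length.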
RapidMiner Server 9.3
Data connectivity at scale
RapidMiner Server now makes it easier to create and manage a re-usable data pipeline that the whole organization can benefit from, with security and policy controls to ensure you don’t jeopardize data governance.
Enterprise user authentication
Implement RapidMiner Server with your preferred identity providers and security policies. RapidMiner Server leverages standard (SAML 2.0) security protocols to integrate seamlessly and supports password, token or multi-factor authentication.
9.2 introduced new charts and visualizations, text analytics for RapidMiner Auto Model, easier RapidMiner Server updates, and more.
RapidMiner Studio 9.2
Text Analytics for RapidMiner Auto Model
Automatically extract features and categorize text content with built-in sentiment analysis and language detection. The new enhanced Auto Model automatically identifies, preprocesses and incorporates textual data for modeling. Leverage text content for prediction or clustering, run pre-built sentiment analysis or language detection and gain insights from text-specific visualizations.
Leverage Automatic Feature Engineering for clustering, improved extraction of features from date columns and the now incorporated Fast Large Margin and Multiclass Logistic Regression learners.
New charts and visualizations
We completely rewrote over 30 charts and visualizations including scatter plots, histograms, parallel coordinates, box plots, word clouds, and many more. The new charts provide rich customization options for colors, plot types, legends, axes, etc. Easily save and load chart configurations and export to various file formats.
Enhanced Time Series capabilities
Extend your time series modeling toolbox with new operators, which can provide exponential smoothing, allow you to calculate a lagged time series, or extract polynomial fit coefficients.
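To make two of these transformations concrete, here is a minimal sketch of simple exponential smoothing and of a lagged series in plain Python. It is written for illustration only and is not the RapidMiner operators; the function names, the smoothing factor and the sample values are assumptions.

```python
def exponential_smoothing(series, alpha):
    """Simple exponential smoothing: blend each value with the previous result."""
    smoothed = [series[0]]
    for value in series[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed


def lag(series, steps):
    """Shift the series forward by `steps`; the first entries become missing."""
    return [None] * steps + series[:len(series) - steps]


# Hypothetical noisy series.
series = [10.0, 20.0, 10.0, 20.0]

smoothed = exponential_smoothing(series, alpha=0.5)
lagged = lag(series, 1)
print(smoothed)
print(lagged)
```

A smaller `alpha` smooths more aggressively, and a lagged copy of a series is a common input feature for forecasting models.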
RapidMiner Server 9.2
Easier and automated upgrade
Incorporate new features and improvements quickly and reliably. You can simply select upgrade in the installer and point to your existing home directory of RapidMiner Server. It will keep your old configuration and migrate if necessary. This is available for migrations starting from version 9.1 onwards.
Centrally managed Job Agent resources
Keep your RapidMiner Server nodes consistent and easy to configure by centrally deploying files like: Extensions, JDBC drivers, Licenses, custom libraries, and Execution context.
Deploy those files centrally within the RapidMiner Server home directory and they automatically get distributed among all connected Job Agents. Distribution will start when RapidMiner Server is (re)started or if licenses are updated in the web interface. There’s also a REST API available to trigger synchronization.
Large file management in the repository
Upload larger files (>2 GB) as there is now no size limitation. It’s easier to test your processes with real data or use the repository as the container of any input data.
9.1 introduced Automatic Feature Engineering, In-Database Processing, enhanced Time Series capabilities, high availability support for RapidMiner Server, and more.
RapidMiner Studio 9.1
Automatic Feature Selection and Engineering
Feature engineering is often the difference between good and great models. This release introduces a unique way to automatically select and generate features that balances model error with model complexity in RapidMiner Auto Model or by using the new Automatic Feature Engineering operator in a RapidMiner process.
Run data prep and ETL workflows from RapidMiner Studio directly inside your database. Simply design your workflow in Studio and we’ll convert it to SQL for you. This is especially important for cloud databases like Google BigQuery where you have to pay for the amount of data you query. We support MySQL, PostgreSQL, and Google BigQuery with more databases to come in the future.
Tackle the complexity of time series data
Understand trends and seasonality using the new time series decomposition operators. You can forecast with the Holt-Winters method and process nominal time series data with the new Windowing, Process Windows, and Replace Missing Values operators.
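The sketch below shows the idea behind a classical additive decomposition: estimate the trend with a centered moving average, then average what is left over for each position in the seasonal cycle. It is a simplified plain-Python illustration, not the RapidMiner operators; the function name and sample data are assumptions, and the sketch only handles an odd period and expects at least two full cycles of data.

```python
def decompose_additive(series, period):
    """Split a series into trend and seasonal parts (odd `period` only)."""
    half = period // 2
    n = len(series)

    # Trend: centered moving average, undefined at both edges.
    trend = [None] * n
    for i in range(half, n - half):
        window = series[i - half:i + half + 1]
        trend[i] = sum(window) / period

    # Seasonal: average detrended value for each position in the cycle.
    buckets = [[] for _ in range(period)]
    for i in range(n):
        if trend[i] is not None:
            buckets[i % period].append(series[i] - trend[i])
    seasonal = [sum(bucket) / len(bucket) for bucket in buckets]

    # Center the seasonal component so that it sums to zero over one cycle.
    offset = sum(seasonal) / period
    seasonal = [value - offset for value in seasonal]
    return trend, seasonal


# Hypothetical series: a rising trend (0, 1, 2, ...) plus the repeating
# seasonal pattern [3, -1, -2].
series = [3.0, 0.0, 0.0, 6.0, 3.0, 3.0, 9.0, 6.0, 6.0]

trend, seasonal = decompose_additive(series, period=3)
print(trend)
print(seasonal)
```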
Read and apply Keras models to your dataset without touching Python.
Better Python and R integration
Link directly to existing R and Python files instead of having to copy and paste them into RapidMiner Studio. Support for the Anaconda Python distribution was also added.
RapidMiner Server 9.1
Minimize downtime for your mission-critical applications through new support for multi-server active-active configurations.
Automate the scheduling of your processes
Integrate RapidMiner processes with external applications using the new and modern scheduler REST API.
RapidMiner Radoop 9.1
Support for HDP 3 and Cloudera 6
Now offering support for Hortonworks HDP 3, allowing you to take advantage of features like storage-saving erasure coding. This release also comes with initial support for Cloudera 6 (limited to certain configurations).
9.0 introduced RapidMiner Turbo Prep – a new data preparation experience inside RapidMiner Studio, improved time series modeling & forecasting, enhanced security and governance for enterprise deployments, and more.
RapidMiner Studio 9.0
RapidMiner Turbo Prep
Spend less of your precious time preparing data. Don’t let yourself get slowed down by clunky data prep tools or by not having a whole lot of data science expertise yet. Use the new RapidMiner Turbo Prep to easily transform, pivot and blend data from multiple sources with a few clicks while instantly seeing the impact of your actions on the data.
- Point and click: Intuitively interact with the data and immediately see how changes impact results.
- Blend, wrangle, and cleanse: Easily blend and join data from a variety of sources including relational databases, NoSQL, APIs, spreadsheets, applications, social media, and more. Quickly extract, join, filter, group, pivot, transform and cleanse your data.
- Re-use and share: Create repeatable data prep processes to save time. When you are finished, send your data directly to RapidMiner Studio or Auto Model for model creation, save your data as Excel or CSV or publish it to data visualization products like Qlik.
Tame the complexity of time series data in challenging use cases like demand forecasting or predictive maintenance. Analyze time series data with the new, now built-in time series modeling & forecasting capabilities: forecast data using ARIMA or any machine-learning-based prediction model, cleanse your time series data by interpolating missing values or applying moving average filters, apply transformations like windowing or a fast Fourier transform (FFT), or perform feature extraction.
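Two of the cleansing steps named above, interpolating missing values and applying a moving average filter, can be sketched in a few lines of plain Python. This is an illustration only, not the RapidMiner operators; the function names and sample data are assumptions, and gaps at the very start or end of the series are simply left missing.

```python
def interpolate_missing(series):
    """Fill None gaps linearly between the nearest known neighbours."""
    result = list(series)
    known = [i for i, value in enumerate(series) if value is not None]
    for left, right in zip(known, known[1:]):
        step = (series[right] - series[left]) / (right - left)
        for i in range(left + 1, right):
            result[i] = series[left] + step * (i - left)
    return result


def moving_average(series, window):
    """Average each run of `window` consecutive values."""
    return [sum(series[i:i + window]) / window
            for i in range(len(series) - window + 1)]


# Hypothetical sensor readings with gaps.
raw = [1.0, None, 3.0, None, None, 9.0]

filled = interpolate_missing(raw)
smoothed = moving_average(filled, window=3)
print(filled)
print(smoothed)
```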
Admin control over settings and preferences
Govern product usage for analytics teams by putting guardrails on how RapidMiner Studio can be used, mitigating the risk of misuse. Pre-configure RapidMiner Studio installations within your organization and enforce important settings and preferences such as policies on password storage, extensions and operators available for use, and proxy settings.
Access to data on Google Cloud Storage
Easily access data on Google Cloud Storage using the new Read Google Storage, Write Google Storage, and Loop Google Storage operators which now accompany their Amazon S3 and Azure Blob Storage counterparts.
Pre-connected training and community repositories
Browse and learn from curated training resources or get inspired by sample content provided by community members, both of which are directly available through pre-connected repositories in RapidMiner Studio.
RapidMiner Server 9.0
Scalable repository for sharing data, processes and models
Manage your data science artifacts at scale – ready to grow to any demand from expanding analytics teams. Work with more and larger files and swiftly move folders around in RapidMiner Server’s new file-based repository. Speed has been improved by more than 10x.
Enhanced security for Enterprise deployments
Rely on full enterprise-grade security including password encryption, protection against Server-Side Forgery attacks, better session control and file-upload control.
Support for MySQL 8
Leverage the latest MySQL version 8.x as a configuration database for RapidMiner Server.
RapidMiner Radoop 9.0
Find the needle in the haystack. Identify fraud, detect abnormal consumer or machine behavior or spot rare but interesting facts by detecting outliers in your data on Hadoop with the new Anomaly Detection operator implementing the Isolation Forest algorithm.
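The intuition behind Isolation Forest is that outliers are easy to separate from the rest of the data with a few random cuts. The sketch below illustrates only that intuition in plain Python and is heavily simplified: it draws a fresh sequence of random cuts per point instead of building and reusing trees, and it skips the subsampling and score normalization of the published algorithm. It is not the Radoop operator, and all names, parameters and data are assumptions.

```python
import random


def path_length(point, sample, rng, depth=0, max_depth=10):
    """Number of random cuts needed to isolate `point` within `sample`."""
    if depth >= max_depth or len(sample) <= 1:
        return depth
    feature = rng.randrange(len(point))        # pick a random column
    low = min(row[feature] for row in sample)
    high = max(row[feature] for row in sample)
    if low == high:                            # nothing left to cut on
        return depth
    split = rng.uniform(low, high)             # pick a random cut point
    # Follow only the side of the cut that contains the point.
    if point[feature] < split:
        side = [row for row in sample if row[feature] < split]
    else:
        side = [row for row in sample if row[feature] >= split]
    return path_length(point, side, rng, depth + 1, max_depth)


def average_path_length(point, data, n_trials=200, seed=0):
    """Average isolation depth; shorter paths suggest an outlier."""
    rng = random.Random(seed)
    total = sum(path_length(point, data, rng) for _ in range(n_trials))
    return total / n_trials


# Hypothetical data: a tight cluster plus one far-away point.
data = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.1), (1.2, 1.0), (1.0, 1.2),
        (0.8, 0.9), (1.1, 1.1), (0.9, 0.8), (10.0, 10.0)]

scores = {point: average_path_length(point, data) for point in data}
most_anomalous = min(scores, key=scores.get)
print(most_anomalous)
```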
Solve time series use cases like forecasting, predictive maintenance and others directly in Hadoop. Use the new windowing operator to easily restructure your time series data residing in Hadoop in a way that can be understood by prediction, clustering and outlier detection algorithms.
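Windowing turns a single series into rows that a standard learner can consume: each row holds a few consecutive past values as features and a later value as the label. The sketch below shows the transformation in plain Python. It is not the Radoop operator; the function name, parameters and sample data are assumptions.

```python
def make_windows(series, window_size, horizon=1):
    """Build (features, label) rows from consecutive values of a series."""
    rows = []
    for start in range(len(series) - window_size - horizon + 1):
        features = series[start:start + window_size]
        label = series[start + window_size + horizon - 1]
        rows.append((features, label))
    return rows


# Hypothetical daily demand values.
demand = [10, 11, 12, 13, 14, 15]

rows = make_windows(demand, window_size=3)
for features, label in rows:
    print(features, "->", label)
```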
Reduce overfitting of machine learning models and improve their predictive performance by preparing data with the new discretization operators in RapidMiner Radoop. Binning or bucketing techniques are useful whenever the exact number representing a value is not meaningful and only adds noise, and they are a great tool to prepare data for algorithms such as decision trees.
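As an example of one such technique, the sketch below performs equal-width binning in plain Python: the value range is cut into a fixed number of equally wide intervals and each value is replaced by the index of its interval. It is an illustration only, not the Radoop operators; the function name and sample data are assumptions.

```python
def equal_width_bins(values, n_bins):
    """Map each value to the index of its equal-width bin (0 .. n_bins - 1)."""
    low, high = min(values), max(values)
    width = (high - low) / n_bins
    bins = []
    for value in values:
        index = int((value - low) / width) if width > 0 else 0
        bins.append(min(index, n_bins - 1))  # the maximum goes in the last bin
    return bins


# Hypothetical temperature readings.
temperatures = [18.0, 19.5, 21.0, 30.0, 24.0, 18.5]

binned = equal_width_bins(temperatures, n_bins=3)
print(binned)
```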
Support for MapR 6
Easily analyze data and use all the predictive power of RapidMiner Radoop in a Hadoop MapR 6 environment.