What’s New

RapidMiner 9.5 is all about optimizing your administrative experience. Whether you’re upgrading your deployment, managing a Docker-based deployment, or connecting to the latest version of Hadoop, you will appreciate these new enhancements.

RapidMiner Studio 9.5

Upgrade RapidMiner Studio independently from Server

Connect to and access data and processes on older Server versions (9.0 or above) with any current or future Studio version. The latest Studio release will verify executability of processes stored on Server. 

RapidMiner Server 9.5

Instant execution in Server

We have drastically improved latency in Server executions. Run short jobs 10x faster than before!

Easily manage Docker-based RapidMiner installations

The new RapidMiner Docker Deployment Manager provides an easy-to-use web interface to scale the number of JCs in your environment and to reconfigure or restart any component, all without specialized experience.

Full control of SLAs on queues – limit the lifetime of jobs

Control your job queues and make sure your Server time is spent in the best way. Now you can define a maximum execution time for jobs, so that, if jobs fail to finish on time, they won’t saturate the queues and affect the SLAs defined for others. It’s now easier to create reliable execution windows in shared environments.

Server settings via API

Fully automate Server deployments by programmatically editing the settings to your chosen values. 

RapidMiner Radoop 9.5

Radoop Proxy for Hadoop 3

We have enhanced Radoop Proxy to work seamlessly with clusters based on Hadoop 3 (such as Cloudera CDH 6.x or HDP 3.x). This means that working with the latest version of Hadoop is easier and more secure.

Revamped general and connection-level settings

To make Radoop more user-friendly, we moved most of the settings from the RapidMiner Studio Preferences to Radoop connections.

The theme of 9.4 is deployment. It’s easy for anyone to deploy and manage models, and we’ve streamlined platform deployment with the RapidMiner AI Cloud.

RapidMiner Studio 9.4

Automated Model Ops

An easy way for Auto Model users of any skill level to deploy models into production. You can analyze the performance of multiple models per use case, monitor governance issues and even swap in better-performing models. When combined with Turbo Prep and Auto Model, this creates a complete path to fully automated data science.

Enhancements to Auto Model, including Profit-Sensitive Scoring

We’ve added a unique capability which allows business users to input cost and revenue variables so models will self-optimize for profitability. We’ve also created a new, clean UI so it’s easy to understand how models were built.
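
To make the idea of profit-sensitive scoring concrete, here is a minimal Python sketch. It is an illustration only, not Auto Model's actual implementation: the function name, the revenue and cost parameters, and the sample data are all assumptions made for this example. It picks the confidence threshold that maximizes total profit when each correct positive prediction earns revenue and each false alarm incurs a cost.

```python
def best_profit_threshold(confidences, labels, revenue_per_hit, cost_per_false_alarm):
    """Return the (threshold, profit) pair with the highest total profit.

    confidences: model confidence for the positive class, one per example
    labels: True if the example is actually positive
    revenue_per_hit: assumed gain for each correctly predicted positive
    cost_per_false_alarm: assumed loss for each wrongly predicted positive
    """
    # Predicting nothing as positive earns zero profit; use that as the baseline.
    best = (1.0, 0.0)
    for threshold in sorted(set(confidences), reverse=True):
        profit = 0.0
        for confidence, is_positive in zip(confidences, labels):
            if confidence >= threshold:
                profit += revenue_per_hit if is_positive else -cost_per_false_alarm
        if profit > best[1]:
            best = (threshold, profit)
    return best


# Hypothetical scored examples: confidence and true outcome.
confidences = [0.9, 0.8, 0.6, 0.4, 0.2]
labels = [True, True, False, True, False]
threshold, profit = best_profit_threshold(confidences, labels, 100.0, 40.0)
print(threshold, profit)
```

With these made-up numbers, accepting one false alarm at 0.6 is worthwhile because it also captures the true positive at 0.4.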

New Data Prep and Modeling Operators

Easily clean your data for modeling and scoring with new operators such as: Replace All Missings, Handle Unknown Values, One Hot Encoding and Append (Robust). You can also rescale confidences – even for classification with more than two classes.

Enhanced Visualizations & New Charts

Help you tell a compelling and intuitive story about your data and your models.

Time Series Enhancements

We’ve added a number of new features that make it easier to create and validate accurate time series forecasting.

RapidMiner Server 9.4

Managed Offerings in the RapidMiner AI Cloud

Users can deploy models into production without heavy lifting from IT. Forget about managing RapidMiner on your own: our highly trained DevOps team can do it for you in our cloud environment.

Real-Time Scoring Now Available As Cloud VMs

Provides easy, fast, scalable deployment in the cloud. Find the new VMs in the AWS and MS Azure marketplaces!

Auto Model Web

A new browser-based version of the proprietary RapidMiner Auto Model technology, built for business users who know their data and use case, but don’t have an advanced data science background (requires RapidMiner Server).

Improved Job Management

Makes it easier to remove old jobs and prevent overuse of server resources by other users.

Enhanced data source connectivity

Connections are easier with a new data connection framework, including simplified connections to sources such as:

RapidMiner Radoop 9.4

Flexible Hadoop Governance

Use any SAML-based enterprise SSO to access Hadoop via the Radoop Proxy. Leverage the new connection framework for connecting to Hadoop via the Radoop Proxy.

RapidMiner 9.3 makes it easier for beginners and experts to work together, speeds up model training, improves time series analysis, and more.

RapidMiner Studio 9.3

RapidMiner Python: drive collaboration between beginners and experts

Why choose between Python and RapidMiner? Augment your RapidMiner toolset with anything you can do in Python. Enhanced integration of RapidMiner and Python means you can select the best approach for the task at hand and enable collaboration of RapidMiner users working code-free and data scientists coding in Python.

Scale Auto Model in RapidMiner Studio with RapidMiner Server 

We’ve given RapidMiner Studio the ability to scale automated ML to large data sets, leveraging the compute power of RapidMiner Server. Auto Model training will be much faster because you can run multiple algorithms in parallel on the distributed architecture of the RapidMiner Server platform and free up Studio to work on other tasks while the Server handles calculations. 

Improved time series analysis and forecasting 

RapidMiner now offers more operators to make it easier to perform time series analysis and forecasting. You can calculate a baseline forecasting performance as a benchmark for future analyses, understand the autocorrelation of a time series, discover hidden patterns, and even improve forecasting performance by isolating the trend and seasonal components of time series data.
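
As a rough illustration of two of these ideas, the following Python sketch computes a naive "last value" baseline error and the autocorrelation at a given lag. It does not reproduce the RapidMiner operators; the function names and the sample series are assumptions for this example, and both functions assume a series with at least two values that are not all identical.

```python
def naive_forecast_mae(series):
    """Mean absolute error of the 'last value' baseline: predict x[t] with x[t-1]."""
    errors = [abs(curr - prev) for prev, curr in zip(series, series[1:])]
    return sum(errors) / len(errors)


def autocorrelation(series, lag):
    """Sample autocorrelation of the series with itself shifted by `lag` steps."""
    mean = sum(series) / len(series)
    variance = sum((x - mean) ** 2 for x in series)
    covariance = sum(
        (series[t] - mean) * (series[t - lag] - mean)
        for t in range(lag, len(series))
    )
    return covariance / variance


series = [1.0, 2.0, 3.0, 4.0, 5.0]
baseline_mae = naive_forecast_mae(series)
lag1 = autocorrelation(series, 1)
print(baseline_mae, lag1)
```

Any forecasting model worth keeping should beat the baseline error on the same data.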

RapidMiner Server 9.3

Data connectivity at scale 

RapidMiner Server now makes it easier to create and manage a re-usable data pipeline that the whole organization can benefit from – with security and policy controls to ensure you don’t jeopardize data governance.

Enterprise user authentication 

Implement RapidMiner Server with your preferred identity providers and security policies. RapidMiner Server leverages standard (SAML 2.0) security protocols to integrate seamlessly and supports password, token or multi-factor authentication.

RapidMiner 9.2 introduced new charts and visualizations, text analytics for RapidMiner Auto Model, easier RapidMiner Server updates, and more.

RapidMiner Studio 9.2

Text Analytics for RapidMiner Auto Model

Automatically extract features and categorize text content with built-in sentiment analysis and language detection. The new enhanced Auto Model automatically identifies, preprocesses and incorporates textual data for modeling. Leverage text content for prediction or clustering, run pre-built sentiment analysis or language detection and gain insights from text-specific visualizations.

Leverage Automatic Feature Engineering for clustering, improved extraction of features from date columns and the now incorporated Fast Large Margin and Multiclass Logistic Regression learners.

New charts and visualizations

We completely rewrote over 30 charts and visualizations including scatter plots, histograms, parallel coordinates, box plots, word clouds, and many more. The new charts provide rich customization options for colors, plot types, legends, axes, etc. Easily save and load chart configurations and export to various file formats.

Enhanced Time Series capabilities

Extend your time series modeling toolbox with new operators that provide exponential smoothing, calculate a lagged time series, or extract polynomial fit coefficients.
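
For readers unfamiliar with these techniques, here is a minimal Python sketch of simple exponential smoothing and of lagging a series. It is a conceptual illustration, not the operators' implementation; the function names, the smoothing factor, and the sample values are assumptions, and the smoothing function assumes a non-empty series.

```python
def exponential_smoothing(series, alpha):
    """Simple exponential smoothing.

    s[0] = x[0]; s[t] = alpha * x[t] + (1 - alpha) * s[t-1]
    """
    smoothed = [series[0]]
    for value in series[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed


def lag(series, k):
    """Shift the series k steps back in time; the first k values are unknown (None)."""
    return [None] * k + series[:len(series) - k]


series = [10.0, 20.0, 30.0, 20.0]
smoothed = exponential_smoothing(series, alpha=0.5)
lagged = lag(series, 1)
print(smoothed)
print(lagged)
```

A larger alpha makes the smoothed series follow recent values more closely; a smaller alpha gives a smoother, slower-reacting curve.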

RapidMiner Server 9.2

Easier and automated upgrade

Incorporate new features and improvements fast and reliably. You can simply select upgrade in the installer and point to your existing home directory of RapidMiner Server. It will keep your old configuration and migrate if necessary. This is available for migrations starting from version 9.1 onwards.

Centrally managed Job Agent resources

Keep your RapidMiner Server nodes consistent and easy to configure by centrally deploying files like: Extensions, JDBC drivers, Licenses, custom libraries, and Execution context.

Deploy those files centrally within the RapidMiner Server home directory and they automatically get distributed among all connected Job Agents. Distribution will start when RapidMiner Server is (re)started or if licenses are updated in the web interface. There’s also a REST API available to trigger synchronization.

Large file management in the repository

Upload larger files (>2 GB) as there is now no size limitation. It’s easier to test your processes with real data or use the repository as the container of any input data.

RapidMiner 9.1 introduced Automatic Feature Engineering, In-Database Processing, enhanced Time Series capabilities, high availability support for RapidMiner Server, and more.

RapidMiner Studio 9.1

Automatic Feature Selection and Engineering

Feature engineering is often the difference between good and great models. This release introduces a unique way to automatically select and generate features that balances model error with model complexity in RapidMiner Auto Model or by using the new Automatic Feature Engineering operator in a RapidMiner process.

In-Database Processing

Run data prep and ETL workflows from RapidMiner Studio directly inside your database. Simply design your workflow in Studio and we’ll convert it to SQL for you. This is especially important for cloud databases like Google BigQuery where you have to pay for the amount of data you query. We support MySQL, PostgreSQL, and Google BigQuery with more databases to come in the future.

Download the extension

Tackle the complexity of time series data

Understand trends and seasonality using the new time series decomposition operators. You can forecast with the Holt-Winters method and process nominal time series data with the new Windowing, Process Windows, and Replace Missing Values operators.

Deep Learning

Read and apply Keras models to your dataset without touching Python.

Better Python and R integration

Link directly to existing R and Python files instead of having to copy and paste them into RapidMiner Studio. Support for the Anaconda Python distribution was also added.

RapidMiner Server 9.1

High availability

Minimize downtime for your mission-critical applications through new support for multi-server active-active configurations.

Automate the scheduling of your processes

Integrate RapidMiner processes with external applications using the new and modern scheduler REST API.

RapidMiner Radoop 9.1

Support for HDP 3 and Cloudera 6

Now offering support for Hortonworks HDP 3, allowing you to take advantage of features like storage-saving erasure coding. This release also comes with initial support for Cloudera 6 (limited to certain configurations).

 

RapidMiner 9.0 introduced RapidMiner Turbo Prep – a new data preparation experience inside RapidMiner Studio, improved time series modeling & forecasting, enhanced security and governance for enterprise deployments, and more.

RapidMiner Studio 9.0

RapidMiner Turbo Prep

Spend less of your precious time preparing data. Don’t let yourself get slowed down by clunky data prep tools or by not having a whole lot of data science expertise yet. Use the new RapidMiner Turbo Prep to easily transform, pivot and blend data from multiple sources with a few clicks while instantly seeing the impact of your actions on the data.

 
  • Point and click: Intuitively interact with the data and immediately see how changes impact results.
  • Blend, wrangle, and cleanse: Easily blend and join data from a variety of sources including relational databases, NoSQL, APIs, spreadsheets, applications, social media, and more. Quickly extract, join, filter, group, pivot, transform and cleanse your data.
  • Re-use and share: Create repeatable data prep processes to save time. When you are finished, send your data directly to RapidMiner Studio or Auto Model for model creation, save your data as Excel or CSV or publish it to data visualization products like Qlik.
 

Time series modeling & forecasting

 

Tame the complexity of time series data in challenging use cases like demand forecasting or predictive maintenance. Analyze time series data with the new, now built-in time series modeling & forecasting capabilities: forecast data using ARIMA or any machine-learning-based prediction model, cleanse your time series data by interpolating missing values or applying moving average filters, apply transformations like windowing or a fast Fourier transform (FFT), or perform feature extraction.
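
The cleansing steps mentioned above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the behavior of the Studio operators: the function names are hypothetical, missing values are represented as None, and gaps at the very start or end of the series are left untouched.

```python
def interpolate_missing(series):
    """Linearly interpolate None gaps that lie between two known values."""
    result = list(series)
    known = [i for i, value in enumerate(result) if value is not None]
    for left, right in zip(known, known[1:]):
        step = (result[right] - result[left]) / (right - left)
        for i in range(left + 1, right):
            result[i] = result[left] + step * (i - left)
    return result


def moving_average(series, window):
    """Average each run of `window` consecutive values."""
    return [
        sum(series[i:i + window]) / window
        for i in range(len(series) - window + 1)
    ]


filled = interpolate_missing([1.0, None, None, 4.0, 5.0])
averaged = moving_average(filled, window=3)
print(filled)
print(averaged)
```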

Admin control over settings and preferences

Govern product usage for analytics teams by putting guardrails on how RapidMiner Studio can be used, mitigating the risk of misuse. Pre-configure RapidMiner Studio installations within your organization and enforce important settings and preferences such as policies on password storage, extensions and operators available for use, and proxy settings.

Access to data on Google Cloud Storage

Easily access data on Google Cloud Storage using the new Read Google Storage, Write Google Storage, and Loop Google Storage operators which now accompany their Amazon S3 and Azure Blob Storage counterparts.

Pre-connected training and community repositories

Browse and learn from curated training resources or get inspired by sample content provided by community members, both of which are directly available through pre-connected repositories in RapidMiner Studio.

RapidMiner Server 9.0

Scalable repository for sharing data, processes and models

Manage your data science artifacts at scale – ready to grow to any demand from expanding analytics teams. Work with more and larger files and swiftly move folders around in RapidMiner Server’s new file-based repository. Speed has been improved by more than 10x.

Enhanced security for Enterprise deployments

Rely on full enterprise-grade security including password encryption, protection against Server-Side Request Forgery (SSRF) attacks, better session control and file-upload control.

Support for MySQL 8

Leverage the latest MySQL version 8.x as a configuration database for RapidMiner Server.

RapidMiner Radoop 9.0

Anomaly detection

Find the needle in the haystack. Identify fraud, detect abnormal consumer or machine behavior or spot rare but interesting facts by detecting outliers in your data on Hadoop with the new Anomaly Detection operator implementing the Isolation Forest algorithm.

 

Windowing

Solve time series use cases like forecasting, predictive maintenance and others directly in Hadoop. Use the new windowing operator to easily restructure your time series data residing in Hadoop in a way that can be understood by prediction, clustering and outlier detection algorithms.
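
To show what this restructuring means, here is a small Python sketch of windowing in general. It is not the Radoop operator and does not run on Hadoop; the function name, parameters, and sample data are assumptions for illustration. Each window of consecutive values becomes one row of features, paired with the value to predict.

```python
def window_examples(series, window_size, horizon=1):
    """Turn a series into (features, label) rows for a learning algorithm.

    features: `window_size` consecutive values
    label: the value `horizon` steps after the end of the window
    """
    rows = []
    for start in range(len(series) - window_size - horizon + 1):
        features = series[start:start + window_size]
        label = series[start + window_size + horizon - 1]
        rows.append((features, label))
    return rows


rows = window_examples([1, 2, 3, 4, 5], window_size=3)
for features, label in rows:
    print(features, "->", label)
```

Once the data is in this tabular shape, ordinary prediction, clustering, and outlier detection algorithms can work with it.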

 

Discretization

Reduce overfitting of machine learning models and improve their predictive performance by preparing data with the new discretization operators in RapidMiner Radoop. Binning or bucketing techniques are useful whenever the exact number representing a value is not meaningful and only adds noise, and they are a great tool to prepare data for algorithms such as decision trees.
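
As a simple illustration of binning, the Python sketch below assigns each value to one of several equal-width bins. It shows only one of many possible discretization strategies and is not the Radoop implementation; the function name and the sample ages are assumptions for this example.

```python
def equal_width_bins(values, n_bins):
    """Map each value to a bin index from 0 to n_bins - 1 using equal-width bins."""
    low, high = min(values), max(values)
    width = (high - low) / n_bins
    if width == 0:
        # All values are identical, so everything falls into the first bin.
        return [0] * len(values)
    # The maximum value would land exactly on the upper edge; clamp it into the last bin.
    return [min(int((value - low) / width), n_bins - 1) for value in values]


ages = [18, 22, 35, 47, 63, 78]
bins = equal_width_bins(ages, n_bins=3)
print(bins)
```

Here the exact ages are replaced by three coarse ranges, which is often all a decision tree needs.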

 

Support for MapR 6

Easily analyze data and use all the predictive power of RapidMiner Radoop in a Hadoop MapR 6 environment.

RapidMiner 8.2 introduced Real-Time Scoring, a new back-end capability designed for lightning-fast scoring, as well as easy configuration of your Hadoop cluster, easier process management in Server, UI enhancements in Studio, and more.

RapidMiner Studio 8.2

Give feedback directly from Studio

We have introduced a way to gather feedback on RapidMiner Studio in the product. Share your thoughts and help us improve Studio to make you and many other data workers more productive.

Error messaging, GUI tweaks, bug fixes and more

  • We’ve started the process of making operators less prone to user errors and made the error messages more informative.
  • The meta operator UI was changed so that it is easily distinguishable from regular operators.
  • Shortcuts were introduced to speed up process design.
  • We rebuilt the FP-Growth operator from scratch and achieved a significant performance improvement. We measured speedups of 5x to 20x on real-world use cases.
  • And more…

RapidMiner Server 8.2

Real-Time Scoring

The new Real-Time Scoring is a back-end capability designed for use cases requiring very fast scoring, like credit card fraud detection or real-time manufacturing controls.

  • Lightning Fast at Scale – Receive, score, and return a result in less than 20 ms and scale to process very large numbers of requests at the same time. Not only does it score individual calls fast, it can also be deployed and scaled horizontally to handle thousands of calls per second.
  • Simple and Easy Integration – Once set up, you deploy your models and related processes via a package produced on RapidMiner Server. These packaged scoring processes are then exposed as web services that can be called from any application.

Better process monitoring and list management

  • We have added several filters to the process list to make it more practical to use. No matter how long your process list is, you can filter based on timeline, queue and duration.
  • The new Execution Details view displays the status of the process, operator by operator, allowing users to follow the executions and monitor their progress.

RapidMiner Radoop 8.2

Easier-than-ever configuration

All the Hadoop and Spark settings and variables can be automatically imported by Radoop, but should you ever need to change or tweak any of them, we have added a nice categorized editor that helps users and administrators to refine the connections and test each section independently.

RapidMiner 8.1 introduces Auto Model, a new addition to RapidMiner Studio that uses automated machine learning to accelerate everything data scientists do when building predictive models, without black boxes. RapidMiner Server is more secure and faster than ever, and RapidMiner Radoop adds new support for MapR.

RapidMiner Studio 8.1

Auto Model, Global Search, new operators, performance upgrades, and more – February 2018

RapidMiner Auto Model: Accelerate Data Science Projects, without the Black Box

RapidMiner Auto Model accelerates the entire data science lifecycle using automated machine learning. It speeds data prep by analyzing data to identify common quality problems. It automates predictive modeling by suggesting the best machine learning techniques and then generating optimized, cross-validated predictive models.

 

RapidMiner Auto Model includes a new model simulator that delivers actionable insight for data science teams. It’s a simple visual interface where you can tune the individual parameters of a model to quickly explore how it performs under a variety of conditions.

Global Search: Everything is searchable, no more cumbersome folder structures

Now you can find anything within your repository and operator list using a central search engine. This includes all processes, models, operators, extensions… even your past actions!

No need to search through your entire folder structure anymore: everything is now at hand!

New Operators, improved performance, extensions showcase

  • We have refactored a few operators, including Join, Correlation Matrix and K-Means, to drastically improve performance, with speed increases of up to 10x
  • New Replace Rare Values operator, new Merge operator, improvements to the Parametric Probability Estimator operator, and more

 

RapidMiner Server 8.1

Enterprise Data Science, faster and more secure – February 2018

Tightened security

  • Enhanced protection against XXE attacks
  • Protection against uploads of malicious executables and other dangerous files
  • Encryption of LDAP connection password in Server 
  • Encryption of passwords in Studio 
  • Admin is now a standard role in Server and multiple administrators can share responsibility 

Improved process monitoring

New job containers send additional information, so you can see more details of the running processes in real time: the status of the process, the percentage of progress, the currently running operator, and more.

RapidMiner Radoop 8.1

Now supporting MapR Hadoop – February 2018

RapidMiner Radoop and the MapR Converged Data Platform now work together to make big data accessible for data scientists who don’t want to code on Hadoop and Spark.

RapidMiner 8.0 improves the reliability and scalability of the RapidMiner platform, enabling the production, deployment, and management of enterprise-scale data science projects.

RapidMiner Studio 8.0

New operators, better operator docs, Fuzzy search – December 2017

  • Parallelized optimization of model parameters
    • Optimize Parameters (Grid)
    • Loop Parameters
  • Easier to understand operator documentation
  • Fuzzy search makes it easier to find operators

RapidMiner Server 8.0

Highly scalable micro-services based architecture for enterprise Data Science Collaboration – December 2017

The new architecture will provide improved reliability and horizontal scalability, setting the stage for enterprise collaboration and resource management.

  • A new distributed architecture – delivering unlimited horizontal scalability
  • Containerized job executions – ensures operational continuity and system stability
  • Flexible and Enhanced configuration capabilities – dedicated resource management
  • An improved Server UI – providing a better experience for monitoring jobs, executions and more

RapidMiner Radoop 8.0

HiveContext Support, Cloudera Spark library upgrade and more

  • A new K-Means operator is available, based on the Spark MLlib/ML clustering algorithm
  • HiveContext is now available in Spark Script, if the user has the appropriate privileges
  • And more….