RapidMiner 9.0 introduces RapidMiner Turbo Prep – a new data preparation experience inside RapidMiner Studio, improved time series modeling & forecasting, enhanced security and governance for Enterprise deployments, and more.

RapidMiner Studio 9.0

RapidMiner Turbo Prep

 

Spend less of your precious time preparing data. Don’t let yourself get slowed down by clunky data prep tools or by not having a whole lot of data science expertise yet. Use the new RapidMiner Turbo Prep to easily transform, pivot and blend data from multiple sources with a few clicks while instantly seeing the impact of your actions on the data.

  • Point and click: Intuitively interact with the data and immediately see how changes impact results.
  • Blend, wrangle, and cleanse: Easily blend and join data from a variety of sources including relational databases, NoSQL, APIs, spreadsheets, applications, social media, and more. Quickly extract, join, filter, group, pivot, transform and cleanse your data.
  • Re-use and share: Create repeatable data prep processes to save time. When you are finished, send your data directly to RapidMiner Studio or Auto Model for model creation, save your data as Excel or CSV, or publish it to data visualization products like Qlik.

Time series modeling & forecasting

Tame the complexity of time series data in challenging use cases like demand forecasting or predictive maintenance. Analyze time series data with the new built-in time series modeling & forecasting capabilities: forecast data using ARIMA or any machine-learning-based prediction model, cleanse your time series data by interpolating missing values or applying moving average filters, apply transformations like windowing or a fast Fourier transform (FFT), or perform feature extraction.
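
To make the cleansing steps concrete, here is a minimal Python sketch of linear interpolation of missing values followed by a moving average filter. It only illustrates the general techniques; it is not how the RapidMiner operators are implemented, and the function names and sample data are assumptions made for this example.

```python
from typing import List, Optional


def interpolate_missing(series: List[Optional[float]]) -> List[float]:
    """Fill None gaps by linear interpolation between the nearest known values.

    Leading or trailing gaps are filled with the nearest known value.
    """
    known = [i for i, v in enumerate(series) if v is not None]
    if not known:
        raise ValueError("series has no known values")
    filled = list(series)
    for i, value in enumerate(series):
        if value is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = series[right]
        elif right is None:
            filled[i] = series[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = series[left] + weight * (series[right] - series[left])
    return filled


def moving_average(series: List[float], window: int) -> List[float]:
    """Trailing moving average; the first window-1 points are dropped."""
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return [
        sum(series[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(series))
    ]


# Hypothetical sensor readings with gaps (None marks a missing value).
raw = [10.0, None, 14.0, 16.0, None, None, 22.0]
clean = interpolate_missing(raw)
smooth = moving_average(clean, window=3)
print(clean)
print(smooth)
```

A trailing window is used here so that each smoothed value depends only on past observations, which is usually what you want before forecasting.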

Admin control over settings and preferences

Govern product usage for analytics teams by putting guardrails on how RapidMiner Studio can be used, mitigating the risk of misuse. Pre-configure RapidMiner Studio installations within your organization and enforce important settings and preferences such as policies on password storage, extensions and operators available for use, and proxy settings.

Access to data on Google Cloud Storage

Easily access data on Google Cloud Storage using the new Read Google Storage, Write Google Storage, and Loop Google Storage operators which now accompany their Amazon S3 and Azure Blob Storage counterparts.

Pre-connected training and community repositories

Browse and learn from curated training resources or get inspired by sample content provided by community members, both of which are directly available through pre-connected repositories in RapidMiner Studio.

RapidMiner Server 9.0

Scalable repository for sharing data, processes and models

Manage your data science artifacts at scale – ready to grow to any demand from expanding analytics teams. Work with more and larger files and swiftly move folders around in RapidMiner Server’s new file-based repository. Speed has been improved by more than 10x.

Enhanced security for Enterprise deployments

Rely on full enterprise-grade security including password encryption, protection against Server-Side Request Forgery (SSRF) attacks, better session control, and file-upload control.

Support for MySQL 8

Leverage the latest MySQL version 8.x as a configuration database for RapidMiner Server.

RapidMiner Radoop 9.0

Anomaly detection

Find the needle in the haystack. Identify fraud, detect abnormal consumer or machine behavior or spot rare but interesting facts by detecting outliers in your data on Hadoop with the new Anomaly Detection operator implementing the Isolation Forest algorithm.
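
The idea behind Isolation Forest can be sketched in a few lines of Python: points that random axis-aligned splits separate from the rest after only a few steps are likely outliers. The sketch below is a simplified, single-machine variant that draws a fresh random split path per point instead of building reusable trees. It is not the Radoop operator's implementation, and all names and data in it are assumptions made for illustration.

```python
import math
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def _c(n: int) -> float:
    """Average path length of an unsuccessful binary search tree lookup."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n


def _path_length(point: Point, data: Sequence[Point], rng: random.Random,
                 depth: int, max_depth: int) -> float:
    """Number of random splits needed to isolate `point` within `data`."""
    if depth >= max_depth or len(data) <= 1:
        return depth + _c(len(data))
    dim = rng.randrange(len(point))
    lo = min(row[dim] for row in data)
    hi = max(row[dim] for row in data)
    if lo == hi:
        return depth + _c(len(data))
    split = rng.uniform(lo, hi)
    # Keep only the rows that fall on the same side of the split as the point.
    same_side = [r for r in data if (r[dim] < split) == (point[dim] < split)]
    return _path_length(point, same_side, rng, depth + 1, max_depth)


def anomaly_scores(data: List[Point], n_trees: int = 100,
                   sample_size: int = 64, seed: int = 0) -> List[float]:
    """Return a score in (0, 1] per point; higher means more anomalous."""
    if len(data) < 2:
        raise ValueError("need at least two points")
    rng = random.Random(seed)
    sample_size = min(sample_size, len(data))
    max_depth = math.ceil(math.log2(sample_size))
    scores = []
    for point in data:
        total = 0.0
        for _ in range(n_trees):
            sample = rng.sample(data, sample_size)
            total += _path_length(point, sample, rng, 0, max_depth)
        mean_path = total / n_trees
        scores.append(2.0 ** (-mean_path / _c(sample_size)))
    return scores


# Ten clustered points plus one obvious outlier (made-up data).
inliers = [(0.1 * i, 0.1 * ((7 * i) % 10)) for i in range(10)]
points = inliers + [(10.0, 10.0)]
scores = anomaly_scores(points)
most_anomalous = scores.index(max(scores))
print(most_anomalous, round(scores[most_anomalous], 3))
```

Scores close to 1 indicate points that are isolated almost immediately, while scores well below 0.5 indicate points buried inside a dense region.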

Windowing

Solve time series use cases like forecasting, predictive maintenance and others directly in Hadoop. Use the new windowing operator to easily restructure your time series data residing in Hadoop in a way that can be understood by prediction, clustering and outlier detection algorithms.
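
As an illustration of what windowing does to the data, the following Python sketch turns a single series into rows of lagged values plus a label to predict. It is a generic example of the technique rather than the Radoop operator itself; the function name, parameters, and sample data are assumptions.

```python
from typing import List, Tuple


def make_windows(series: List[float], window_size: int,
                 horizon: int = 1) -> List[Tuple[List[float], float]]:
    """Turn a series into (features, label) rows.

    Each row holds `window_size` consecutive values as features and the value
    `horizon` steps after the window as the label.
    """
    if window_size < 1 or horizon < 1:
        raise ValueError("window_size and horizon must be at least 1")
    rows = []
    last_start = len(series) - window_size - horizon
    for start in range(last_start + 1):
        features = series[start : start + window_size]
        label = series[start + window_size + horizon - 1]
        rows.append((features, label))
    return rows


# Hypothetical daily demand values.
demand = [5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
rows = make_windows(demand, window_size=3)
for features, label in rows:
    print(features, "->", label)
```

Once the series is in this tabular shape, any standard prediction, clustering, or outlier detection algorithm can consume it like an ordinary data set.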

Discretization

Reduce overfitting of machine learning models and improve their predictive performance by preparing data with the new discretization operators in RapidMiner Radoop. Binning or bucketing techniques are useful whenever the exact number representing a value is not meaningful and only adds noise. They are also a great tool for preparing data for algorithms such as decision trees.
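
One simple form of binning is equal-width discretization, sketched below in Python. This is only an illustration of the general idea; the Radoop discretization operators may use different strategies, and the function name and sample data are assumptions.

```python
from typing import List, Sequence


def equal_width_bins(values: Sequence[float], n_bins: int) -> List[int]:
    """Map each value to a bin index in 0..n_bins-1 using equal-width bins."""
    if n_bins < 1:
        raise ValueError("n_bins must be at least 1")
    lo, hi = min(values), max(values)
    if lo == hi:
        # All values are identical, so everything falls into the first bin.
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it to the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Hypothetical customer ages grouped into three ranges.
ages = [18, 22, 25, 31, 40, 47, 58, 63, 78]
bins = equal_width_bins(ages, n_bins=3)
print(bins)  # [0, 0, 0, 0, 1, 1, 2, 2, 2]
```

Replacing the exact age with a bin index removes small, meaningless differences between values while keeping the ordering that a decision tree can split on.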

Support for MapR 6

Easily analyze data and use all the predictive power of RapidMiner Radoop in a Hadoop MapR 6 environment.

RapidMiner 8.2 introduces Real-Time Scoring, a new back-end capability designed for lightning-fast scoring, along with easier configuration of your Hadoop cluster, easier process management in Server, UI enhancements in Studio, and more.

RapidMiner Studio 8.2

Give feedback directly from Studio

We have introduced a way to give feedback on RapidMiner Studio directly in the product. Share your thoughts and help us improve Studio so that you and many other data workers can be more productive.

Error messaging, GUI tweaks, bug fixes and more

  • We’ve started the process of making operators less prone to user errors and made the error messages more informative.
  • The meta operator UI was changed so that meta operators are easily distinguishable from regular operators.
  • New shortcuts speed up process design.
  • We rebuilt the FP-Growth operator from scratch and achieved a significant performance improvement, with measured speedups of 5x to 20x on real-world use cases.
  • And more…

RapidMiner Server 8.2

Real-Time Scoring

The new Real-Time Scoring is a back-end capability designed for use cases requiring very fast scoring, like credit card fraud detection or real-time manufacturing controls.

  • Lightning Fast at Scale – Receive, score, and return a result in less than 20 ms, and scale to process very large numbers of requests at the same time. Not only does it score individual calls quickly, it can also be deployed and scaled horizontally to handle thousands of calls per second.
  • Simple and Easy Integration – Once set up, you deploy your models and related processes via a package produced on RapidMiner Server. These packaged scoring processes are then exposed as web services that can be called from any application.

Better process monitoring and list management

  • We have added several filters to the process list to make it more practical to use. No matter how long your process list is, you can filter it by timeline, queue, and duration.
  • The new Execution Details view displays the status of the process, operator by operator, allowing users to follow the executions and monitor their progress.

RapidMiner Radoop 8.2

Easier-than-ever configuration

All the Hadoop and Spark settings and variables can be automatically imported by Radoop, but should you ever need to change or tweak any of them, we have added a nice categorized editor that helps users and administrators to refine the connections and test each section independently.

RapidMiner 8.1 introduces Auto Model, a new addition to RapidMiner Studio that uses automated machine learning to accelerate everything data scientists do when building predictive models #noblackboxes. RapidMiner Server is more secure and faster than ever and RapidMiner Radoop adds new support for MapR. 

RapidMiner Studio 8.1

Auto Model, Global Search, new operators, performance upgrades, and more – February 2018

RapidMiner Auto Model: Accelerate Data Science Projects, without the Black Box

 

RapidMiner Auto Model accelerates the entire data science lifecycle using automated machine learning. It speeds data prep by analyzing data to identify common quality problems. It automates predictive modeling by suggesting the best machine learning techniques and then generating optimized, cross-validated predictive models.

RapidMiner Auto Model includes a new model simulator that delivers actionable insight for data science teams. It’s a simple visual interface where you can tune the individual parameters of a model to quickly explore how it performs under a variety of conditions.

Global Search: Everything is searchable, no more cumbersome folder structures

Now you can find anything within your repository and operator list using a central search engine. This includes all processes, models, operators, extensions… even your past actions!

No need to search through your entire folder structure anymore: everything is now at hand!

New Operators, improved performance, extensions showcase

  • We have refactored a few operators, including Join, Correlation Matrix, and K-Means, to drastically improve performance, with speed increases of up to 10x
  • New Replace Rare Values operator, new Merge operator, improvements to the Parametric Probability Estimator operator, and more

RapidMiner Server 8.1

Enterprise Data Science, faster and more secure – February 2018

Tightened security

  • Enhanced protection against XXE attacks
  • Protection against malicious upload of executables and other dangerous files
  • Encryption of LDAP connection password in Server
  • Encryption of passwords in Studio
  • Admin is now a standard role in Server and multiple administrators can share responsibility

Improved process monitoring

New job containers send additional information, so you can see more details of running processes in real time: the status of the process, the percentage of progress, the currently running operator, and more.

RapidMiner Radoop 8.1

Now supporting MapR Hadoop – February 2018

RapidMiner Radoop and the MapR Converged Data Platform now work together to make big data accessible for data scientists who don’t want to code on Hadoop and Spark.

RapidMiner 8.0 improves the reliability and scalability of the RapidMiner platform, enabling the production, deployment, and management of enterprise-scale data science projects.

RapidMiner Studio 8.0

New operators, better operator docs, Fuzzy search – December 2017

  • Parallelized optimization of model parameters
    • Optimize Parameters (Grid)
    • Loop Parameters
  • Easier to understand operator documentation
  • Fuzzy search makes it easier to find operators

RapidMiner Server 8.0

Highly scalable micro-services based architecture for enterprise Data Science Collaboration – December 2017

The new architecture will provide improved reliability and horizontal scalability, setting the stage for enterprise collaboration and resource management.

  • A new distributed architecture – delivering unlimited horizontal scalability
  • Containerized job executions – ensures operational continuity and system stability
  • Flexible and Enhanced configuration capabilities – dedicated resource management
  • An improved Server UI – providing a better experience for monitoring jobs, executions and more

RapidMiner Radoop 8.0

HiveContext Support, Cloudera Spark library upgrade and more

  • A new K-Means operator is available, based on the Spark MLlib/ML clustering algorithm
  • HiveContext is now available in Spark Script, if the user has the appropriate privileges
  • And more…