RapidMiner 9.2 introduces new charts and visualizations, text analytics for RapidMiner Auto Model, easier RapidMiner Server updates, and more.
RapidMiner Studio 9.2
Text Analytics for RapidMiner Auto Model
Automatically extract features and categorize text content with built-in sentiment analysis and language detection. The new enhanced Auto Model automatically identifies, preprocesses and incorporates textual data for modeling. Leverage text content for prediction or clustering, run pre-built sentiment analysis or language detection and gain insights from text-specific visualizations.
Leverage Automatic Feature Engineering for clustering, improved extraction of features from date columns and the now incorporated Fast Large Margin and Multiclass Logistic Regression learners.
New charts and visualizations
We completely rewrote over 30 charts and visualizations, including scatter plots, histograms, parallel coordinates, box plots, word clouds, and many more. The new charts provide rich customization options for colors, plot types, legends, axes, and more. Easily save and load chart configurations and export to various file formats.
Enhanced Time Series capabilities
Extend your time series modeling toolbox with new operators that provide exponential smoothing, calculate a lagged time series, or extract polynomial fit coefficients.
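To illustrate what exponential smoothing and a lagged series compute, here is a minimal Python sketch. It is not the RapidMiner implementation; the function names `exponential_smoothing` and `lag` and the parameter `alpha` are assumptions made for this example.

```python
from typing import List, Optional


def exponential_smoothing(values: List[float], alpha: float) -> List[float]:
    """Simple exponential smoothing.

    s[0] = x[0]; s[t] = alpha * x[t] + (1 - alpha) * s[t-1].
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed: List[float] = []
    for x in values:
        if not smoothed:
            # The first smoothed value is the first observation itself.
            smoothed.append(x)
        else:
            smoothed.append(alpha * x + (1.0 - alpha) * smoothed[-1])
    return smoothed


def lag(values: List[float], k: int) -> List[Optional[float]]:
    """Shift a series by k steps; positions without a predecessor become None."""
    if k < 0:
        raise ValueError("k must be non-negative")
    padding: List[Optional[float]] = [None] * min(k, len(values))
    return padding + list(values[: max(len(values) - k, 0)])


if __name__ == "__main__":
    print(exponential_smoothing([10, 20, 30], alpha=0.5))
    print(lag([1, 2, 3, 4], k=1))
```

A larger `alpha` follows the latest observations more closely, while a smaller one yields a smoother curve.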
RapidMiner Server 9.2
Easier and automated upgrade
Incorporate new features and improvements quickly and reliably. Simply select upgrade in the installer and point it to the existing home directory of your RapidMiner Server. The installer keeps your old configuration and migrates it if necessary. This is available for upgrades from version 9.1 onwards.
Centrally managed Job Agent resources
Keep your RapidMiner Server nodes consistent and easy to configure by centrally deploying files such as extensions, JDBC drivers, licenses, custom libraries, and the execution context.
Deploy those files centrally within the RapidMiner Server home directory and they automatically get distributed among all connected Job Agents. Distribution will start when RapidMiner Server is (re)started or if licenses are updated in the web interface. There’s also a REST API available to trigger synchronization.
Large file management in the repository
Upload larger files (>2 GB), as there is now no size limitation. It's easier to test your processes with real data or use the repository as the container for any input data.
RapidMiner 9.1 introduces Automatic Feature Engineering, In-Database Processing, enhanced Time Series capabilities, high availability support for RapidMiner Server, and more.
RapidMiner Studio 9.1
Automatic Feature Selection and Engineering
Feature engineering is often the difference between good and great models. This release introduces a unique way to automatically select and generate features that balances model error with model complexity in RapidMiner Auto Model or by using the new Automatic Feature Engineering operator in a RapidMiner process.
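One common way to think about balancing model error against model complexity is to keep only the candidates that no other candidate beats on both criteria (a Pareto front). The sketch below illustrates that idea only; it is not a description of how RapidMiner selects features, and the candidate names and numbers are made up for the example.

```python
from typing import List, Tuple

# Each candidate is (name, error, complexity); lower is better for both.
Candidate = Tuple[str, float, int]


def pareto_front(candidates: List[Candidate]) -> List[str]:
    """Return the names of candidates that are not dominated by any other.

    A candidate is dominated if another one is at least as good on both
    error and complexity and strictly better on at least one of them.
    """
    front: List[str] = []
    for name, err, cx in candidates:
        dominated = any(
            (e2 <= err and c2 <= cx) and (e2 < err or c2 < cx)
            for _, e2, c2 in candidates
        )
        if not dominated:
            front.append(name)
    return front


if __name__ == "__main__":
    # Hypothetical feature sets with their error and number of features.
    feature_sets = [("A", 0.30, 2), ("B", 0.20, 5), ("C", 0.25, 6), ("D", 0.20, 9)]
    print(pareto_front(feature_sets))
```

The remaining candidates represent different trade-offs, so the final choice depends on how much complexity is acceptable for a given gain in accuracy.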
Run data prep and ETL workflows from RapidMiner Studio directly inside your database. Simply design your workflow in Studio and we’ll convert it to SQL for you. This is especially important for cloud databases like Google BigQuery where you have to pay for the amount of data you query. We support MySQL, PostgreSQL, and Google BigQuery with more databases to come in the future.
Tackle the complexity of time series data
Understand trends and seasonality using the new time series decomposition operators. You can forecast with the Holt-Winters method and process nominal time series data with the new Windowing, Process Windows, and Replace Missing Values operators.
Read and apply Keras models to your dataset without touching Python.
Better Python and R integration
Link directly to existing R and Python files instead of having to cut + paste them into RapidMiner Studio. Support for Anaconda Python distribution was also added.
RapidMiner Server 9.1
Minimize downtime for your mission-critical applications through new support for multi-server active-active configurations.
Automate the scheduling of your processes
Integrate RapidMiner processes with external applications using the new and modern scheduler REST API.
RapidMiner Radoop 9.1
Support for HDP 3 and Cloudera 6
Now offering support for Hortonworks HDP 3, allowing you to take advantage of features like storage-saving erasure coding. This release also comes with initial support for Cloudera 6 (limited to certain configurations).
RapidMiner 9.0 introduces RapidMiner Turbo Prep – a new data preparation experience inside RapidMiner Studio, improved time series modeling & forecasting, enhanced security and governance for Enterprise deployments, and more.
RapidMiner Studio 9.0
RapidMiner Turbo Prep
Spend less of your precious time preparing data. Don’t let yourself get slowed down by clunky data prep tools or by not having a whole lot of data science expertise yet. Use the new RapidMiner Turbo Prep to easily transform, pivot and blend data from multiple sources with a few clicks while instantly seeing the impact of your actions on the data.
- Point and click: Intuitively interact with the data and immediately see how changes impact results.
- Blend, wrangle, and cleanse: Easily blend and join data from a variety of sources including relational databases, NoSQL, APIs, spreadsheets, applications, social media, and more. Quickly extract, join, filter, group, pivot, transform and cleanse your data.
- Re-use and share: Create repeatable data prep processes to save time. When you are finished, send your data directly to RapidMiner Studio or Auto Model for model creation, save your data as Excel or CSV or publish it to data visualization products like Qlik.
Time series modeling & forecasting
Tame the complexity of time series data in challenging use cases like demand forecasting or predictive maintenance. Analyze time series data with the new built-in time series modeling & forecasting capabilities: forecast data using ARIMA or any machine learning based prediction model; cleanse your time series data by interpolating missing values or applying moving average filters; apply transformations like windowing or a fast Fourier transform (FFT); or perform feature extraction.
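As an illustration of the cleansing steps mentioned above, the following Python sketch fills missing values by linear interpolation and applies a simple moving average filter. It is a minimal example under our own assumptions (missing values are represented as `None`, and the function names are hypothetical), not the behavior of the RapidMiner operators.

```python
from typing import List, Optional


def interpolate_missing(values: List[Optional[float]]) -> List[Optional[float]]:
    """Fill None gaps linearly between the nearest known neighbors.

    Leading and trailing gaps have only one neighbor and are left as None.
    """
    result = list(values)
    known = [i for i, v in enumerate(result) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (result[right] - result[left]) / (right - left)
        for i in range(left + 1, right):
            result[i] = result[left] + step * (i - left)
    return result


def moving_average(values: List[float], window: int) -> List[float]:
    """Average each run of `window` consecutive values."""
    if window < 1:
        raise ValueError("window must be at least 1")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


if __name__ == "__main__":
    print(interpolate_missing([1.0, None, None, 4.0, None]))
    print(moving_average([1, 2, 3, 4, 5], window=3))
```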
Admin control over settings and preferences
Govern product usage for analytics teams by putting guardrails on how RapidMiner Studio can be used, mitigating the risk of misuse. Pre-configure RapidMiner Studio installations within your organization and enforce important settings and preferences such as policies on password storage, extensions and operators available for use, and proxy settings.
Access to data on Google Cloud Storage
Easily access data on Google Cloud Storage using the new Read Google Storage, Write Google Storage, and Loop Google Storage operators which now accompany their Amazon S3 and Azure Blob Storage counterparts.
Pre-connected training and community repositories
Browse and learn from curated training resources or get inspired by sample content provided by community members, both of which are directly available through pre-connected repositories in RapidMiner Studio.
RapidMiner Server 9.0
Scalable repository for sharing data, processes and models
Manage your data science artifacts at scale – ready to grow to any demand from expanding analytics teams. Work with more and larger files and swiftly move folders around in RapidMiner Server’s new file-based repository. Speed has been improved by more than 10x.
Enhanced security for Enterprise deployments
Rely on full enterprise-grade security including password encryption, protection against Server-Side Forgery attacks, better session control and file-upload control.
Support for MySQL 8
Leverage the latest MySQL version 8.x as a configuration database for RapidMiner Server.
RapidMiner Radoop 9.0
Find the needle in the haystack. Identify fraud, detect abnormal consumer or machine behavior or spot rare but interesting facts by detecting outliers in your data on Hadoop with the new Anomaly Detection operator implementing the Isolation Forest algorithm.
Solve time series use cases like forecasting, predictive maintenance and others directly in Hadoop. Use the new windowing operator to easily restructure your time series data residing in Hadoop in a way that can be understood by prediction, clustering and outlier detection algorithms.
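The restructuring that windowing performs can be pictured as follows: each row holds a fixed number of consecutive values as features, and the value that follows serves as the label. This Python sketch is illustrative only; the function `make_windows` and its parameters `window_size` and `horizon` are assumptions for the example, not Radoop operator parameters.

```python
from typing import List, Tuple


def make_windows(
    series: List[float], window_size: int, horizon: int = 1
) -> List[Tuple[List[float], float]]:
    """Turn a series into (features, label) rows.

    features: window_size consecutive values.
    label: the value `horizon` steps after the end of the window.
    """
    if window_size < 1 or horizon < 1:
        raise ValueError("window_size and horizon must be at least 1")
    rows: List[Tuple[List[float], float]] = []
    last_start = len(series) - window_size - horizon
    for start in range(last_start + 1):
        features = series[start:start + window_size]
        label = series[start + window_size + horizon - 1]
        rows.append((features, label))
    return rows


if __name__ == "__main__":
    for features, label in make_windows([1, 2, 3, 4, 5], window_size=3):
        print(features, "->", label)
```

Rows in this shape can be passed to any learner that expects a fixed set of input columns.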
Reduce overfitting of machine learning models and improve their predictive performance by preparing data with the new discretization operators in RapidMiner Radoop. Binning or bucketing techniques are useful whenever the exact number representing a value is not meaningful and only adds noise, and they are a great tool for preparing data for algorithms such as decision trees.
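As a small illustration of binning, the sketch below assigns each value to one of several equal-width bins. Equal-width binning is just one discretization strategy, chosen here as an assumption; the text above does not say which strategies the Radoop operators implement, and the function name is hypothetical.

```python
from typing import List


def equal_width_bins(values: List[float], n_bins: int) -> List[int]:
    """Map each value to a bin index in [0, n_bins - 1] using equal-width bins."""
    if n_bins < 1:
        raise ValueError("n_bins must be at least 1")
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so they share a single bin.
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # The maximum value falls exactly on the upper edge; clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


if __name__ == "__main__":
    print(equal_width_bins([0, 5, 10, 15, 20], n_bins=4))
```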
Support for MapR 6
Easily analyze data and use all the predictive power of RapidMiner Radoop in a Hadoop MapR 6 environment.
RapidMiner 8.2 introduces Real-Time Scoring, a new back-end capability designed for lightning fast scoring, easy configuration of your Hadoop cluster, easier process management in server, UI enhancements in Studio and more.
RapidMiner Studio 8.2
Give feedback directly from Studio
We have introduced a way to gather feedback on RapidMiner Studio in the product. Share your thoughts and help us improve Studio to make you and many other data workers more productive.
Error messaging, GUI tweaks, bug fixes and more
- We’ve started the process of making operators less prone to user errors and made the error messages more informative.
- The meta operator UI was changed so that meta operators are easily distinguishable from regular operators.
- Shortcuts were introduced to speed up process design.
- We rebuilt the FP-Growth operator from scratch and achieved a significant performance improvement: we measured speedups from 5x up to 20x on real-world use cases.
- And more…
RapidMiner Server 8.2
The new real-time scoring capability is a back-end capability designed for use cases requiring very fast scoring, like credit card fraud detection or real-time manufacturing controls.
- Lightning Fast at Scale – Receive, score, and return a result in less than 20 ms and scale to process very large numbers of requests at the same time. Not only does it score individual calls fast, it can also be deployed and scaled horizontally to handle 1000s of calls per second.
- Simple and Easy Integration – Once set up, you deploy your models and related processes via a package produced on the RapidMiner Server. These packaged scoring processes are then exposed as Web Services that can be called from any application.
Better process monitoring and list management
- We have added several filters to the process list to make it more practical to use. No matter how long your process list, you can filter based on timeline, queue and duration.
- The new Execution Details view displays the status of the process, operator by operator, allowing users to follow the executions and monitor their progress.
RapidMiner Radoop 8.2
All the Hadoop and Spark settings and variables can be automatically imported by Radoop, but should you ever need to change or tweak any of them, we have added a nice categorized editor that helps users and administrators to refine the connections and test each section independently.
RapidMiner 8.1 introduces Auto Model, a new addition to RapidMiner Studio that uses automated machine learning to accelerate everything data scientists do when building predictive models #noblackboxes. RapidMiner Server is more secure and faster than ever and RapidMiner Radoop adds new support for MapR.
RapidMiner Studio 8.1
Auto Model, Global Search, new operators, performance upgrades, and more – February 2018
RapidMiner Auto Model: Accelerate Data Science Projects, without the Black Box
RapidMiner Auto Model accelerates the entire data science lifecycle using automated machine learning. It speeds data prep by analyzing data to identify common quality problems. It automates predictive modeling by suggesting the best machine learning techniques and then generating optimized, cross-validated predictive models.
RapidMiner Auto Model includes a new model simulator that delivers actionable insight for data science teams. It’s a simple visual interface where you can tune the individual parameters of a model to quickly explore how it performs under a variety of conditions.
Global Search: Everything is searchable, no more cumbersome folder structures
Now you can find anything within your repository and operator list using a central search engine. This includes all processes, models, operators, extensions… even your past actions!
No need to search through your entire folder structure anymore: everything is now at hand!
New Operators, improved performance, extensions showcase
- We have refactored a few operators, including Join, Correlation Matrix and K-Means, to drastically improve performance, with up to 10x increases in speed.
- New Replace Rare Values operator, new Merge operator, improvements for the Parametric Probability Estimator operator, and more.
RapidMiner Server 8.1
Enterprise Data Science, faster and more secure – February 2018
- Enhanced protection against XXE attacks
- Protection against malicious upload of executables and other dangerous files
- Encryption of LDAP connection password in Server
- Encryption of passwords in Studio
- Admin is now a standard role in Server and multiple administrators can share responsibility
Improved process monitoring
New job containers send additional information, so you can see more details of the running processes in real time: the status of the process, the percentage of progress, the currently running operator, and more.
RapidMiner Radoop 8.1
Now supporting MapR Hadoop – February 2018
RapidMiner Radoop and the MapR Converged Data Platform now work together to make big data accessible for data scientists who don’t want to code on Hadoop and Spark.
RapidMiner 8.0 improves the reliability and scalability of the RapidMiner platform, enabling the production, deployment, and management of enterprise-scale data science projects.
RapidMiner Studio 8.0
New operators, better operator docs, Fuzzy search – December 2017
- Parallelized optimization of model parameters
- Optimize Parameters (Grid)
- Loop Parameters
- Easier to understand operator documentation
- Fuzzy search makes it easier to find operators