RapidMiner 9.8 continues to innovate in data science collaboration, connectivity and governance.
Overview of Platform Enhancements
- Advanced AI governance and collaboration in the AI Hub: Optimize performance when working with large (or very large!) files in projects and effortlessly exchange content between projects – all in an effort to make it easier to promote development versions to production environments. Additionally, queue owners can now put one user in charge of the execution environment to better fit the needs of their organization.
- Enhanced connectivity in enterprise environments: Simplified proxy setup with automatic configuration of settings (based on OS settings or PAC files), improved Radoop proxy to streamline connectivity to Hadoop clusters (including support for KMS or Timeline Service traffic), support for connections to utilize external data in real-time use cases on the edge (using Real-Time Scoring nodes) and for MS-SQL in-database processing.
- Continued investments into data science innovations and governance: Fully centralized Python and R coding environment management with the AI Hub (now including kernels inside Notebooks), fast and efficient K-Means clustering based on H2O’s implementation, several image mining and deep learning improvements such as a UI to monitor model training, ready-made Docker environments, and new operators and advanced (time-based) windowing functionality for time series operators.
- Branding customization of RapidMiner Go for OEM customers: Create a familiar and seamless branded experience for users with customizable UI elements. Customize the interface with your company colors, as well as a custom application name and URL.
RapidMiner 9.7 continues to put people at the center of the AI journey by fostering better collaboration without sacrificing governance.
RapidMiner Server is now the RapidMiner AI Hub
The RapidMiner AI Hub connects people, processes and systems to ensure AI delivers business impact. Collaboration is even easier between coders, Studio users and RapidMiner Go users. Read up on why we made the change.
The RapidMiner AI Hub now offers:
- New project-based repository offering unprecedented collaboration and governance for AI: Diverse teams can work together on use cases in a central location across automated, visual and code-based authoring styles. Enterprises can easily and iteratively convert ideas into business impact.
- Projects include fine-grained version control based on Git standards: Version-controlled projects enable iterative and collaborative, yet governed, development of AI, delivering a rare combination of agility and traceability in your AI development lifecycles.
- Enterprise-grade identity & access management: To support collaboration at large scale, we’ve introduced a new identity and access management framework, including Single Sign-On to create a consistent experience across the platform (Go, Studio and AI Hub, as well as JupyterHub, our platform admin tools, and our interactive dashboarding capabilities).
RapidMiner AI Hub
Central collaboration and governance
The new projects with universal data storage (i.e., storage of any file type) drive seamless collaboration across different AI authoring styles (automated, visual and code-based). Git-based version control tracks all changes as “snapshots”, allowing users to easily “roll back”, which enables smoother collaboration and conflict resolution.
Identity and access management
RapidMiner AI Hub now ships with a new identity provider component based on the open source component Keycloak. It provides precise access controls by user, group and role, and a seamless single sign-on experience across the platform. It also integrates with all common identity providers, user databases and more.
RapidMiner Notebooks, the JupyterLab interface in the RapidMiner platform, is now fully integrated with the new projects framework, allowing for seamless and easy collaboration of coders with other users as well as full traceability of the code-based work within the AI lifecycle.
The RapidMiner AI Hub now features a new dashboard providing insight into what’s going on in your system, including: executed and failed jobs, disk usage of projects, configured schedules, web services, and more.
RapidMiner Studio has been enhanced to support the use of the projects framework that’s new to the RapidMiner platform.
New file format for high-performance
Studio now leverages HDF5 for data storage, which enhances stability and performance with large amounts of data. It also enables easier and faster data exchange with Python.
Improvements to augmented machine learning
Auto Model reduces memory usage and run times and allows multiple Auto Model jobs to be submitted to the AI Hub at once.
Model Ops offers flexible model storage options for deployed models. Unused and ID columns are now kept in the results after scoring for enhanced audits.
Updated H2O library
Enhances performance, exposes weights for deep learning and more.
We’ve added a number of new enhancements to Radoop that improve convenience and ease of use.
RapidMiner 9.6 is putting people at the center of your AI journey. What do we mean by that? We’re aiming to make machine learning accessible to anyone and drive collaboration between people of different backgrounds and preferences.
A brand new, fully automated and guided offering, built for users with minimal data science experience. All you need is a data set and a few minutes. RapidMiner Go is accessible through a browser, so there’s no need to download anything or use local machine resources. RapidMiner Go is tightly integrated with the rest of the platform so business users can prototype models and collaborate with more experienced users to get the models into production.
RapidMiner Studio 9.6
Model Ops works with any models
Model Ops is now the go-to way for your entire analytics team to universally manage and operate all of your models across your enterprise, no matter how they were created. We support custom-created models, code-based models and even complex models with multiple pre-processing steps.
Time series and time zone enhancements
We’ve solved some common challenges with using time-based data, including brand new operators that improve time series-specific data transformation and feature extraction.
RapidMiner Server 9.6
Code in the RapidMiner platform with JupyterHub
We’ve made it easier for full-time coders to work more collaboratively with non-coders by co-deploying JupyterHub with RapidMiner Server. This includes SSO and easy connection to the repository.
Easily create interactive dashboards and visualizations with your results
We’ve integrated popular open-source dashboarding technology that’s easy to use and offers a wide variety of visualizations.
Stop all the jobs
A new “clean-up” button can reset the Server queue and stop all running and pending jobs, leaving the Server in a clean state, ready to start anew.
RapidMiner Radoop 9.6
Enterprise-ready experience enhancements
We’ve added a number of new enhancements to Radoop that improve the experience, especially in larger enterprise environments.
Time series and time zone enhancements
We’ve solved some common challenges with using time-based data, including some brand new operators that improve time series-specific data transformation and feature extraction.
RapidMiner 9.5 is all about optimizing your administrative experience. Whether you’re upgrading your deployment, managing a Docker-based deployment, or connecting to the latest version of Hadoop, you will appreciate these new enhancements.
RapidMiner Studio 9.5
Upgrade RapidMiner Studio independently from Server
Connect to and access data and processes on older Server versions (9.0 or above) with any current or future Studio version. The latest Studio release will verify executability of processes stored on Server.
RapidMiner Server 9.5
Instant execution in Server
We have drastically improved latency in Server executions. Run short jobs 10x faster than before!
Easily manage Docker-based RapidMiner installations
The new RapidMiner Docker Deployment Manager provides an easy-to-use web interface to scale the number of Job Containers (JCs) in your environment, and to re-configure or restart any component without specialized experience.
Full control of SLAs on queues – limit the lifetime of jobs
Control your job queues and make sure your Server time is spent in the best way. Now you can define a maximum execution time for jobs, so that, if jobs fail to finish on time, they won’t saturate the queues and affect the SLAs defined for others. It’s now easier to create reliable execution windows in shared environments.
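To see why a maximum execution time protects other jobs in a shared queue, here is a minimal simulation in Python. It is only a sketch under simple assumptions (a single worker, jobs run strictly in order, durations known in advance); the function name `simulate_queue` and its parameters are made up for illustration and are not part of RapidMiner Server.

```python
def simulate_queue(durations, max_runtime=None):
    """Run jobs one after another; return (finish_time, completed) per job."""
    clock = 0.0
    outcome = []
    for duration in durations:
        if max_runtime is not None and duration > max_runtime:
            # The job is stopped as soon as it reaches the limit.
            clock += max_runtime
            outcome.append((clock, False))
        else:
            clock += duration
            outcome.append((clock, True))
    return outcome


# Three jobs (durations in seconds); the second one is a runaway job.
without_limit = simulate_queue([5, 600, 10])
with_limit = simulate_queue([5, 600, 10], max_runtime=60)
```

Without a limit, the third job only finishes after 615 seconds; with a 60-second limit the runaway job is stopped and the third job finishes after 75 seconds.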
Server settings via API
Fully automate Server deployments by programmatically editing the settings to your chosen values.
RapidMiner Radoop 9.5
Radoop Proxy for Hadoop 3
We have enhanced Radoop Proxy to work seamlessly with clusters based on Hadoop 3 (such as Cloudera CDH 6.x or HDP 3.x). This means that working with the latest version of Hadoop is easier and more secure.
Revamped general and connection-level settings
To make Radoop more user-friendly, we moved most of the settings from the RapidMiner Studio Preferences to Radoop connections.
The theme of 9.4 is deployment. It’s easy for anyone to deploy and manage models, and we’ve streamlined platform deployment with the RapidMiner AI Cloud.
RapidMiner Studio 9.4
Automated Model Ops
An easy way for Auto Model users of any skill level to deploy models into production. You can analyze performance of multiple models per use case, monitor governance issues and even swap in better performing models. When combined with Turbo Prep and Auto Model, this creates a complete path to fully automated data science. Read more here.
Enhancements to Auto Model, including Profit-Sensitive Scoring
We’ve added a unique capability which allows business users to input cost and revenue variables so models will self-optimize for profitability. We’ve also created a new, clean UI so it’s easy to understand how models were built.
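The idea behind optimizing for profit rather than accuracy can be sketched in a few lines of Python. This is an illustrative assumption about how such a calculation might look, not the Auto Model implementation: each correctly flagged case earns a revenue, each wrongly flagged case incurs a cost, and the cut-off on the model's confidence is chosen to maximize the total. All names and numbers below are hypothetical.

```python
def total_profit(scores, labels, threshold, revenue_tp, cost_fp):
    """Profit if every case scoring at or above the threshold is acted on."""
    profit = 0.0
    for score, label in zip(scores, labels):
        if score >= threshold:
            profit += revenue_tp if label == 1 else -cost_fp
    return profit


def best_threshold(scores, labels, revenue_tp, cost_fp):
    """Try each distinct score as a cut-off and keep the most profitable one."""
    candidates = sorted(set(scores))
    return max(
        ((t, total_profit(scores, labels, t, revenue_tp, cost_fp)) for t in candidates),
        key=lambda pair: pair[1],
    )


# Hypothetical model confidences and true outcomes (1 = positive case).
scores = [0.9, 0.8, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0]
threshold, profit = best_threshold(scores, labels, revenue_tp=100.0, cost_fp=40.0)
```

With these numbers the most profitable cut-off is 0.4: it accepts one false positive in exchange for capturing a third true positive.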
New Data Prep and Modeling Operators
Easily clean your data for modeling and scoring with new operators such as: Replace All Missings, Handle Unknown Values, One Hot Encoding and Append (Robust). You can also rescale confidences – even for classification with more than two classes.
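For readers unfamiliar with one-hot encoding, the following Python sketch shows the general technique together with one possible way to treat values that were not seen during training. It does not reproduce the behavior of the RapidMiner operators; the helper names and the choice to encode unknown or missing values as all zeros are assumptions made for this example.

```python
def fit_categories(values):
    """Collect the distinct non-missing categories seen in training data, sorted."""
    return sorted({v for v in values if v is not None})


def one_hot(values, categories):
    """Encode each value as a 0/1 vector; unseen or missing values become all zeros."""
    index = {cat: i for i, cat in enumerate(categories)}
    rows = []
    for v in values:
        row = [0] * len(categories)
        if v in index:
            row[index[v]] = 1
        rows.append(row)
    return rows


train = ["red", "green", None, "red"]
categories = fit_categories(train)
# "blue" was never seen in training and None is a missing value.
encoded = one_hot(["red", "blue", None], categories)
```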
Enhanced Visualizations & New Charts
Help you tell a compelling and intuitive story about your data and your models.
Time Series Enhancements
We’ve added a number of new features that make it easier to create and validate accurate time series forecasting.
RapidMiner Server 9.4
Managed Offerings in the RapidMiner AI Cloud
Users can deploy models into production without heavy lifting from IT. Forget about managing RapidMiner on your own; our highly trained DevOps team can do it for you in our cloud environment.
Real-Time Scoring Now Available As Cloud VMs
Provides easy, fast, scalable deployment in the cloud. Find the new VMs in the AWS and MS Azure marketplaces!
Auto Model Web
A new browser-based version of the proprietary RapidMiner Auto Model technology, built for business users who know their data and use case, but don’t have advanced data science background (requires RapidMiner Server).
Improved Job Management
Makes it easier to remove old jobs and prevent overuse of server resources by other users.
RapidMiner Radoop 9.4
Flexible Hadoop Governance
Use any SAML-based enterprise SSO to access Hadoop via the Radoop Proxy. Leverage the new connection framework for connecting to Hadoop via the Radoop Proxy.
RapidMiner 9.3 makes it easier for beginners and experts to work together, speeds up model training, improves time series analysis, and more.
RapidMiner Studio 9.3
RapidMiner Python: drive collaboration between beginners and experts
Why choose between Python and RapidMiner? Augment your RapidMiner toolset with anything you can do in Python. Enhanced integration of RapidMiner and Python means you can select the best approach for the task at hand and enable collaboration of RapidMiner users working code-free and data scientists coding in Python.
Scale Auto Model in RapidMiner Studio with RapidMiner Server
We’ve given RapidMiner Studio the ability to scale automated ML to large data sets, leveraging the compute power of RapidMiner Server. Auto Model training will be much faster because you can run multiple algorithms in parallel on the distributed architecture of the RapidMiner Server platform and free up Studio to work on other tasks while the Server handles calculations.
Improved time series analysis and forecasting
RapidMiner now offers more operators to make it easier to perform time series analysis and forecasting. You can calculate a baseline forecasting performance as a benchmark for future analyses, understand the autocorrelation of a time series, discover hidden patterns, and even improve forecasting performance by isolating the trend and seasonal components of time series data.
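Two of the concepts mentioned here, a baseline forecast and autocorrelation, can be illustrated with a short Python sketch. It assumes the simplest possible baseline (predict each value with the previous one) and the standard sample autocorrelation formula; it is not the code behind the RapidMiner operators, and the function names are invented for this example.

```python
def naive_forecast_mae(series):
    """Mean absolute error of predicting each value with the previous one."""
    errors = [abs(series[i] - series[i - 1]) for i in range(1, len(series))]
    return sum(errors) / len(errors)


def autocorrelation(series, lag):
    """Sample autocorrelation of the series at the given positive lag."""
    n = len(series)
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series)
    covariance = sum(
        (series[i] - mean) * (series[i - lag] - mean) for i in range(lag, n)
    )
    return covariance / variance


# A series that alternates with period 2.
series = [1.0, 2.0, 1.0, 2.0, 1.0, 2.0, 1.0, 2.0]
baseline_error = naive_forecast_mae(series)
acf_lag1 = autocorrelation(series, 1)
acf_lag2 = autocorrelation(series, 2)
```

Any forecasting model worth keeping should beat `baseline_error`. The strongly positive value at lag 2 and the negative value at lag 1 reveal the period-2 pattern in the data.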
RapidMiner Server 9.3
Data connectivity at scale
RapidMiner Server now makes it easier to create and manage a re-usable data pipeline that the whole organization can benefit from – with security and policy controls to ensure you don’t jeopardize data governance.
Enterprise user authentication
Implement RapidMiner Server with your preferred identity providers and security policies. RapidMiner Server leverages standard (SAML 2.0) security protocols to integrate seamlessly and supports password, token or multi-factor authentication.
RapidMiner 9.2 introduced new charts and visualizations, text analytics for RapidMiner Auto Model, easier RapidMiner Server updates, and more.
RapidMiner Studio 9.2
Text Analytics for RapidMiner Auto Model
Automatically extract features and categorize text content with built-in sentiment analysis and language detection. The new enhanced Auto Model automatically identifies, preprocesses and incorporates textual data for modeling. Leverage text content for prediction or clustering, run pre-built sentiment analysis or language detection and gain insights from text-specific visualizations.
Leverage Automatic Feature Engineering for clustering, improved extraction of features from date columns and the now incorporated Fast Large Margin and Multiclass Logistic Regression learners.
New charts and visualizations
We completely rewrote over 30 charts and visualizations including scatter plots, histograms, parallel coordinates, box plots, word clouds, and many more. The new charts provide rich customization options for colors, plot types, legends, axes, etc. Easily save and load chart configurations and export to various file formats.
Enhanced Time Series capabilities
Extend your time series modeling toolbox with new operators, which can provide exponential smoothing, allow you to calculate a lagged time series, or extract polynomial fit coefficients.
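As a rough illustration of two of these transformations, here is a minimal Python sketch of simple exponential smoothing and of lagging a series. It shows the textbook definitions only and makes no claim about how the RapidMiner operators are implemented; the function names are chosen for this example.

```python
def exponential_smoothing(series, alpha):
    """Simple exponential smoothing: s[t] = alpha * x[t] + (1 - alpha) * s[t-1]."""
    smoothed = [series[0]]
    for x in series[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


def lag(series, k):
    """Position t holds the original value at t - k; the first k values are missing."""
    return [None] * k + series[: len(series) - k]


smoothed = exponential_smoothing([10, 20, 30], alpha=0.5)
lagged = lag([10, 20, 30, 40], 1)
```

A smaller `alpha` gives a smoother curve that reacts more slowly to new values. A lagged copy of a series is often added as an extra column so that a model can use yesterday's value to predict today's.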
RapidMiner Server 9.2
Easier and automated upgrade
Incorporate new features and improvements fast and reliably. You can simply select upgrade in the installer and point to your existing home directory of RapidMiner Server. It will keep your old configuration and migrate if necessary. This is available for migrations starting from version 9.1 onwards.
Centrally managed Job Agent resources
Keep your RapidMiner Server nodes consistent and easy to configure by centrally deploying files like: Extensions, JDBC drivers, Licenses, custom libraries, and Execution context.
Deploy those files centrally within the RapidMiner Server home directory and they automatically get distributed among all connected Job Agents. Distribution will start when RapidMiner Server is (re)started or if licenses are updated in the web interface. There’s also a REST API available to trigger synchronization.
Large file management in the repository
Upload larger files (>2 GB) as there is now no size limitation. It’s easier to test your processes with real data or use the repository as the container of any input data.
RapidMiner 9.1 introduced Automatic Feature Engineering, In-Database Processing, enhanced Time Series capabilities, high availability support for RapidMiner Server, and more.
RapidMiner Studio 9.1
Automatic Feature Selection and Engineering
Feature engineering is often the difference between good and great models. This release introduces a unique way to automatically select and generate features that balances model error with model complexity in RapidMiner Auto Model or by using the new Automatic Feature Engineering operator in a RapidMiner process.
Run data prep and ETL workflows from RapidMiner Studio directly inside your database. Simply design your workflow in Studio and we’ll convert it to SQL for you. This is especially important for cloud databases like Google BigQuery where you have to pay for the amount of data you query. We support MySQL, PostgreSQL, and Google BigQuery with more databases to come in the future.
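The general idea of pushing a workflow down into the database can be sketched as follows. This Python example is an assumption-laden toy: it translates a single select/filter/limit step into one SQL statement and runs it against an in-memory SQLite database as a stand-in for MySQL, PostgreSQL or BigQuery. The function `compile_to_sql` and the `orders` table are hypothetical and unrelated to the SQL that RapidMiner actually generates.

```python
import sqlite3


def compile_to_sql(table, columns, where=None, limit=None):
    """Translate a simple select/filter/limit workflow into one SQL statement."""
    sql = f"SELECT {', '.join(columns)} FROM {table}"
    if where:
        sql += f" WHERE {where}"
    if limit is not None:
        sql += f" LIMIT {limit}"
    return sql


# In-memory SQLite database with a hypothetical table, used only for the demo.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL, region TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 50.0, "EU"), (2, 120.0, "US"), (3, 200.0, "EU")],
)

sql = compile_to_sql("orders", ["id", "amount"], where="region = 'EU' AND amount > 100")
rows = conn.execute(sql).fetchall()
```

Because the filter runs inside the database, only the matching rows leave it, which is what keeps query costs down on pay-per-query systems. Note that building SQL from strings like this is only safe with trusted input.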
Tackle the complexity of time series data
Understand trends and seasonality using the new time series decomposition operators. You can forecast with the Holt-Winters method and process nominal time series data with the new Windowing, Process Windows, and Replace Missing Values operators.
Read and apply Keras models to your dataset without touching Python.
Better Python and R integration
Link directly to existing R and Python files instead of having to cut + paste them into RapidMiner Studio. Support for Anaconda Python distribution was also added.
RapidMiner Server 9.1
Minimize downtime for your mission-critical applications through new support for multi-server active-active configurations.
Automate the scheduling of your processes
Integrate RapidMiner processes with external applications using the new and modern scheduler REST API.
RapidMiner Radoop 9.1
Support for HDP 3 and Cloudera 6
Now offering support for Hortonworks HDP 3, allowing you to take advantage of features like storage-saving erasure coding. This release also comes with initial support for Cloudera 6 (limited to certain configurations).
RapidMiner 9.0 introduced RapidMiner Turbo Prep – a new data preparation experience inside RapidMiner Studio, improved time series modeling & forecasting, enhanced security and governance for enterprise deployments, and more.
RapidMiner Studio 9.0
RapidMiner Turbo Prep
Spend less of your precious time preparing data. Don’t let yourself get slowed down by clunky data prep tools or by not having a whole lot of data science expertise yet. Use the new RapidMiner Turbo Prep to easily transform, pivot and blend data from multiple sources with a few clicks while instantly seeing the impact of your actions on the data.
- Point and click: Intuitively interact with the data and immediately see how changes impact results.
- Blend, wrangle, and cleanse: Easily blend and join data from a variety of sources including relational databases, NoSQL, APIs, spreadsheets, applications, social media, and more. Quickly extract, join, filter, group, pivot, transform and cleanse your data.
- Re-use and share: Create repeatable data prep processes to save time. When you are finished, send your data directly to RapidMiner Studio or Auto Model for model creation, save your data as Excel or CSV or publish it to data visualization products like Qlik.
Tame the complexity of time series data in challenging use cases like demand forecasting or predictive maintenance. Analyze time series data with the new, now built-in time series modelling & forecasting capabilities: Forecast data using ARIMA or any Machine Learning based prediction model, cleanse your time series data by interpolating missing values or applying moving average filters, apply transformations like windowing or a fast Fourier transform (FFT) or perform feature extraction.
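Three of the cleansing and transformation steps named above can be sketched in plain Python. This is a simplified illustration of the general techniques (linear interpolation, a trailing moving average, and windowing into training examples), not the RapidMiner operators themselves; function names and defaults are assumptions.

```python
def interpolate_missing(series):
    """Fill interior None gaps by linear interpolation between known neighbours."""
    filled = list(series)
    known = [i for i, x in enumerate(filled) if x is not None]
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled


def moving_average(series, width):
    """Trailing moving average over `width` consecutive values."""
    return [
        sum(series[i - width + 1 : i + 1]) / width
        for i in range(width - 1, len(series))
    ]


def windows(series, size, horizon=1):
    """Pair each run of `size` past values with the value `horizon` steps ahead."""
    pairs = []
    for start in range(len(series) - size - horizon + 1):
        features = series[start : start + size]
        label = series[start + size + horizon - 1]
        pairs.append((features, label))
    return pairs


clean = interpolate_missing([1.0, None, 3.0, None, None, 6.0])
smooth = moving_average(clean, 3)
examples = windows([1, 2, 3, 4, 5], size=3)
```

Windowing is the step that lets an ordinary machine learning model forecast: each window of past values becomes a row of features, and the next value becomes the label.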
Admin control over settings and preferences
Govern product usage for analytics teams by putting guardrails on how RapidMiner Studio can be used, mitigating the risk of misuse. Pre-configure RapidMiner Studio installations within your organization and enforce important settings and preferences such as policies on password storage, extensions and operators available for use, and proxy settings.
Access to data on Google Cloud Storage
Easily access data on Google Cloud Storage using the new Read Google Storage, Write Google Storage, and Loop Google Storage operators which now accompany their Amazon S3 and Azure Blob Storage counterparts.
Pre-connected training and community repositories
Browse and learn from curated training resources or get inspired by sample content provided by community members, both of which are directly available through pre-connected repositories in RapidMiner Studio.
RapidMiner Server 9.0
Scalable repository for sharing data, processes and models
Manage your data science artifacts at scale – ready to grow to any demand from expanding analytics teams. Work with more and larger files and swiftly move folders around in RapidMiner Server’s new file-based repository. Speed has been improved by more than 10x.
Enhanced security for Enterprise deployments
Rely on full enterprise-grade security including password encryption, protection against Server-Side Forgery attacks, better session control and file-upload control.
Support for MySQL 8
Leverage the latest MySQL version 8.x as a configuration database for RapidMiner Server.
RapidMiner Radoop 9.0
Find the needle in the haystack. Identify fraud, detect abnormal consumer or machine behavior or spot rare but interesting facts by detecting outliers in your data on Hadoop with the new Anomaly Detection operator implementing the Isolation Forest algorithm.
Solve time series use cases like forecasting, predictive maintenance and others directly in Hadoop. Use the new windowing operator to easily restructure your time series data residing in Hadoop in a way that can be understood by prediction, clustering and outlier detection algorithms.
Reduce overfitting of machine learning models and improve their predictive performance by preparing data with the new discretization operators in RapidMiner Radoop. Binning or bucketing techniques are useful whenever the exact number representing a value is not meaningful and only adds noise and they are a great tool to prepare data for algorithms such as decision trees.
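Binning itself is a simple idea, shown here as a small Python sketch of equal-width discretization. It runs locally on a list, whereas Radoop performs the work on Hadoop, so treat it purely as an illustration of the technique; the function name and sample values are hypothetical.

```python
def equal_width_bins(values, n_bins):
    """Assign each value a bin index from 0 to n_bins - 1 using equal-width intervals."""
    low, high = min(values), max(values)
    width = (high - low) / n_bins
    if width == 0:
        # All values are identical, so everything falls into one bin.
        return [0] * len(values)
    # The maximum value would land in bin n_bins, so clamp it into the last bin.
    return [min(int((v - low) / width), n_bins - 1) for v in values]


ages = [18, 22, 35, 41, 58, 78]
bins = equal_width_bins(ages, 3)
```

Here the exact ages are replaced by three coarse groups (18 to 38, 38 to 58, 58 to 78), which removes detail that a decision tree might otherwise overfit to.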
Support for MapR 6
Easily analyze data and use all the predictive power of RapidMiner Radoop in a Hadoop MapR 6 environment.
RapidMiner 8.2 introduced Real-Time Scoring, a new back-end capability designed for lightning fast scoring, easy configuration of your Hadoop cluster, easier process management in server, UI enhancements in Studio and more.
RapidMiner Studio 8.2
Give feedback directly from Studio
We have introduced a way to gather feedback on RapidMiner Studio in the product. Share your thoughts, help us improve Studio to make you and many other data workers more productive.
Error messaging, GUI tweaks, bug fixes and more
- We’ve started the process of making operators less prone to user errors and made the error messages more informative.
- Meta operator UI was changed, so it’s easily distinguishable from regular operators.
- Shortcuts introduced to boost process design.
- We rebuilt the FP-Growth operator from scratch and achieved significant performance improvement. We measured from 5x up to 20x speed up on real world use cases.
- And more…
RapidMiner Server 8.2
The new real-time scoring capability is a back-end capability designed for use cases requiring very fast scoring like credit card fraud detection or real-time manufacturing controls.
- Lightning Fast at Scale – Receive, score, and return a result in less than 20 ms and scale to process very large amounts of requests at the same time. So, not only does it score individual calls fast, but it can be deployed and scaled horizontally to handle 1000s of calls per second.
- Simple and Easy Integration – Once set up, you deploy your models and related process via a package produced on the RapidMiner Server. These packaged processes for scoring are then exposed as Web Services that can be called from any application.
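Calling a scoring web service from an application usually comes down to an HTTP request with a JSON body. The Python sketch below builds such a request using only the standard library. The host, port, path, service name and payload shape are all hypothetical placeholders, since the release notes do not specify them; consult the product documentation for the real endpoint format.

```python
import json
import urllib.request


def build_scoring_request(base_url, service, record):
    """Build (but do not send) a JSON POST request for one record to score."""
    # The {"data": [...]} payload shape is an assumption made for this sketch.
    body = json.dumps({"data": [record]}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{service}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_scoring_request(
    "http://localhost:8090/services",  # hypothetical host and path
    "fraud-check",                     # hypothetical service name
    {"amount": 129.99, "country": "DE"},
)
# To actually send it: urllib.request.urlopen(request, timeout=1.0)
```

Keeping request construction separate from sending makes the integration easy to test without a running scoring node.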
Better process monitoring and list management
- We have added several filters to the process list to make it more practical to use. No matter how long your process list, you can filter based on timeline, queue and duration.
- The new Execution Details view displays the status of the process, operator by operator, allowing users to follow the executions and monitor their progress.
RapidMiner Radoop 8.2
All the Hadoop and Spark settings and variables can be automatically imported by Radoop, but should you ever need to change or tweak any of them, we have added a nice categorized editor that helps users and administrators to refine the connections and test each section independently.
RapidMiner 8.1 introduces Auto Model, a new addition to RapidMiner Studio that uses automated machine learning to accelerate everything data scientists do when building predictive models #noblackboxes. RapidMiner Server is more secure and faster than ever and RapidMiner Radoop adds new support for MapR.
RapidMiner Studio 8.1
Auto Model, Global Search, new operators, performance upgrades, and more – February 2018
RapidMiner Auto Model: Accelerate Data Science Projects, without the Black Box
RapidMiner Auto Model accelerates the entire data science lifecycle using automated machine learning. It speeds data prep by analyzing data to identify common quality problems. It automates predictive modeling by suggesting the best machine learning techniques and then generating optimized, cross-validated predictive models.
RapidMiner Auto Model includes a new model simulator that delivers actionable insight for data science teams. It’s a simple visual interface where you can tune the individual parameters of a model to quickly explore how it performs under a variety of conditions.
Global Search: Everything is searchable, no more cumbersome folder structures
RapidMiner Server 8.1
Enterprise Data Science, faster and more secure – February 2018
- Enhanced protection against XXE attacks
- Protection against uploads of malicious executables and other dangerous files
- Encryption of LDAP connection password in Server
- Encryption of passwords in Studio
- Admin is now a standard role in Server and multiple administrators can share responsibility
New job containers send additional information, so you can see more details of the running processes in real time: status of the process, percentage of the progress, the current running operator, and others.
RapidMiner Radoop 8.1
RapidMiner Radoop and the MapR Converged Data Platform now work together to make big data accessible for data scientists who don’t want to code on Hadoop and Spark.
RapidMiner 8.0 improves the reliability and scalability of the RapidMiner platform, enabling the production, deployment, and management of enterprise-scale data science projects.
RapidMiner Studio 8.0
New operators, better operator docs, Fuzzy search – December 2017
- Parallelized optimization of model parameters
- Optimize Parameters (Grid)
- Loop Parameters
- Easier to understand operator documentation
- Fuzzy search makes it easier to find operators
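Parallelized parameter optimization over a grid can be pictured with the following Python sketch. It is a generic illustration and not the Optimize Parameters (Grid) operator: the `evaluate` function stands in for training and validating a model, and its error surface, the parameter names and the worker count are all made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def evaluate(params):
    """Stand-in for training and validating a model; returns (error, params)."""
    depth, rate = params
    error = (depth - 4) ** 2 + (rate - 0.1) ** 2  # hypothetical error surface
    return error, params


def grid_search(grid, max_workers=4):
    """Evaluate every parameter combination in parallel and return the best one."""
    combinations = list(product(*grid.values()))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(evaluate, combinations))
    best_error, best_params = min(results, key=lambda r: r[0])
    return dict(zip(grid.keys(), best_params)), best_error


grid = {"depth": [2, 4, 8], "learning_rate": [0.01, 0.1, 0.5]}
best, error = grid_search(grid)
```

Each combination is independent of the others, which is why a grid search parallelizes so well. For CPU-bound training in Python, a process pool would typically replace the thread pool used here.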
RapidMiner Server 8.0
Highly scalable micro-services based architecture for enterprise Data Science Collaboration – December 2017
The new architecture will provide improved reliability and horizontal scalability, setting the stage for enterprise collaboration and resource management.
- A new distributed architecture – delivering unlimited horizontal scalability
- Containerized job executions – ensures operational continuity and system stability
- Flexible and Enhanced configuration capabilities – dedicated resource management
- An improved Server UI – providing a better experience for monitoring jobs, executions and more
RapidMiner Radoop 8.0
HiveContext Support, Cloudera Spark library upgrade and more
- A new K-Means operator is available, based on the Spark MLlib/ML clustering algorithm
- HiveContext is now available in Spark Script, if the user has the appropriate privileges
- And more…