“Water distribution companies need to look beyond routine pipeline repair, such as saying ‘we’ll just replace 1% of the network each year.’ They need instead to adopt a monetized risk approach, applying data science to achieve more efficient repair operations. It’s a win on every front.”

The company hired EIS to use its data science expertise to better identify the parts of the pipeline that represented the highest overall risk, in terms of likelihood of failure (LoF) and the potential consequences should a leak or break occur in that location. The company needed to optimize its repair investments because it operates on razor-thin margins, given how little consumers are accustomed to paying for water. The company also often applies for government grants and loans to subsidize pipeline renewal, and it needs to justify its requests for funds and explain how and where they will be used, usually in terms of LoF.
“Water distribution companies have an obligation to their customers and their communities to ensure safe, clean and reliable drinking water,” said Michael Gloven, managing partner at EIS. “At the same time, these companies generally have limited renewal budgets. Progressive water distribution companies are starting to use everything at their disposal, including advanced data science and machine learning techniques, to predict LoF, so they can spend wisely.”
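The idea of ranking pipeline segments by monetized risk can be illustrated with a minimal sketch. It assumes, as a simplification not stated in the case study, that monetized risk is the product of a segment's likelihood of failure and the estimated cost of a failure at that location. The `Segment` class, its field names, and all figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One pipeline segment (hypothetical structure for illustration)."""
    segment_id: str
    lof: float               # assumed annual likelihood of failure, between 0 and 1
    consequence_cost: float  # assumed cost if a failure occurs at this location

    @property
    def monetized_risk(self) -> float:
        # Simplifying assumption: expected annual loss = likelihood x consequence.
        return self.lof * self.consequence_cost


def rank_by_risk(segments):
    """Return segments ordered from highest to lowest monetized risk."""
    return sorted(segments, key=lambda s: s.monetized_risk, reverse=True)


# Made-up example values, not data from the case study.
segments = [
    Segment("A", lof=0.10, consequence_cost=50_000),
    Segment("B", lof=0.02, consequence_cost=1_000_000),
    Segment("C", lof=0.30, consequence_cost=10_000),
]

ranked = rank_by_risk(segments)
for s in ranked:
    print(s.segment_id, round(s.monetized_risk, 2))
```

In this made-up example, the segment most likely to fail (C) ranks last once consequences are taken into account, which is the kind of reprioritization a monetized risk view is meant to surface.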

“Looking beyond just regulatory compliance to assessing monetized risk in the pipeline, and applying data science to that need, means more efficient repair operations, more leaks prevented and fixed, a better bottom line, and less risk of catastrophic damage.”

A local distribution company (LDC) in the natural gas industry needed to predict which parts of its pipeline were at greatest risk of failure and consequence to nearby residents and businesses.
By overlaying the potential consequences of each predicted failure, the company was able to adopt a monetized risk approach to managing the pipeline. As a result, the company’s pipeline operations are now safer and its spending decisions are more efficient. Every company with a network of any kind – other kinds of fuel pipelines, power grids, telecommunications networks, even roads and bridges – should take note, as they could use data science in the same way with similar benefits.

“Automation reduces the experience of the operators regarding the process dynamics and can lead to information overload in critical situations. When control is lost, human life and the environment are endangered.”

The goal of the BMBF research project FEE was to detect critical situations in chemical plants at an early stage, and to develop assistance functions that support plant operators in decision making during critical situations.
An increasing degree of automation is essential to stay competitive in the chemical industry. However, this adoption also leads to a decrease in operator experience and an increased risk of overburdening operators, especially in the case of abnormal/critical plant situations.

“Our customer service team starts each day knowing the messages waiting for them are the right ones for them to prioritize to improve the customer experience. Our teams are more efficient and our customers are happier with the improved service.”

This customer service department was continuously bombarded with customer messages, delivered via email, website submissions and social media – almost 500 messages per day, on average. Not all of these messages actually needed the customer service team’s attention.
LIAT set up a process that uses RapidMiner’s text mining capabilities to parse every customer message received into key words and phrases. They also use RapidMiner to build a predictive model that classifies which department should be handling each message based on its content as analyzed via text mining.
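The two steps described above, breaking each message into words and then predicting which department should handle it, can be sketched in a few lines. The case study does not say which model was used, so this sketch substitutes a simple multinomial naive Bayes classifier written with the Python standard library. The class name `NaiveBayesRouter`, the department labels, and the training messages are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the message and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesRouter:
    """Multinomial naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # department -> word frequencies
        self.label_counts = Counter()            # department -> number of messages
        self.vocabulary = set()

    def train(self, labeled_messages):
        for text, label in labeled_messages:
            tokens = tokenize(text)
            self.label_counts[label] += 1
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)

    def predict(self, text):
        tokens = tokenize(text)
        total_messages = sum(self.label_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(count / total_messages)
            denominator = sum(self.word_counts[label].values()) + vocab_size
            for token in tokens:
                score += math.log((self.word_counts[label][token] + 1) / denominator)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up training messages and department labels.
router = NaiveBayesRouter()
router.train([
    ("my bag was lost on the flight", "baggage"),
    ("suitcase damaged and bag missing", "baggage"),
    ("I want a refund for my cancelled ticket", "refunds"),
    ("please refund the ticket price", "refunds"),
])

print(router.predict("where is my lost bag"))
print(router.predict("refund my ticket"))
```

A production setup would train on far more messages and would likely add a category for messages that need no action, since the text notes that not every message requires the team's attention.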

“The ProMondi Project was a milestone in applying data science to optimize the production line.”

This ebook summarizes the completed project “ProMondi – Predicting Assembly Plans in digital Factories,” sponsored by the German Government (BMBF), for which RapidMiner was chosen as the primary data science partner. The aim of the ProMondi project was to develop a method for the prospective determination of assembly work contents in an early phase of the product development process. Two of the use cases from this project were Daimler and Miele.

“The project partners have identified huge potential for the application of machine learning to predict product defects early in the production line. Some of these ideas are currently being deployed and tested in plants across the world.”

This case study focuses on the data science aspects of a completed research project called PRESED (Predictive Sensor Data mining for Product Quality Improvement), co‐financed by the Research Fund for Coal and Steel of the European Commission, that RapidMiner was proud to be a part of.
The experts helped the steel manufacturers understand where the defect came from and enabled the team to find a solution to fix it, but the answer was not always immediate, and it required some investigation. These investigations were what this research project aimed to fast-track and optimize, as the potential upside was huge, greatly reducing the waste of energy and resources.

“Machine learning allowed the US state auditor to integrate and consider various data sources, create meaningful features and scores, provide context and explanations, and detect networks of fraudsters.”

This case study focuses on how a US state auditor leveraged machine learning to detect and prevent fraud in healthcare. The biggest challenge of fighting fraud in healthcare is that fraudsters are often intelligent, learn from mistakes, and continuously create new types of fraud. Hence, techniques are needed that can capture known fraud patterns as well as new types of fraud.

“RapidMiner has increased Operational Analysis’s capacity to support the wider business’s work, and more importantly, to improve the travel experience of London’s residents and visitors.”

Transport for London (TfL) is the integrated transport authority responsible for keeping London moving, for the day-to-day operation of London’s public transportation network, and for the management of London’s main roads.
The RapidMiner users at TfL are part of Operational Analysis, a data science and analytics team within the Network Performance department. Network Performance is responsible for the safe and efficient operation of the road network, managing the traffic signals and ensuring safe, high-quality roadworks across the city. Their mission also includes achieving more progressive goals such as increasing the usage and efficiency of sustainable modes of transportation (bikes and buses), and limiting real-world disruption by modelling and visualising future changes.

“This manufacturer sees lots of opportunity for data science to improve product quality and production yield; many involve predicting with great precision exactly how a step in the production process should be executed, to maximize the desired results.”

As a leading silicon wafer manufacturer, this company produces wafers in a variety of sizes, including specialty wafers with specific features to meet the particular needs of different customer types. It has a global customer base made up primarily of electronics and semiconductor manufacturers.
Given the minute, delicate, and sophisticated nature of circuits, microchips and the like, the quality of the silicon wafer is critical to the success of the companies that use wafers as inputs to their own products. So, this manufacturer invests heavily in ensuring its silicon wafers are of the highest possible quality. Because product quality and production yield are more important than ever before, this manufacturer must increase silicon wafer production by decreasing loss due to defects and increasing the yield from its existing production facilities.

“The data science team uses the insights gained from RapidMiner to adjust practically every aspect of its operations to reduce customer support costs and improve its customer experience.”

As a manufacturer of high-end consumer electronics, this company must, and does, offer post-sales support to its customers. This includes giving customers the opportunity to call or email the company to ask questions about the products they have purchased, including requesting technical assistance.
The company’s objective is always to reduce the cost of providing this support, while at the same time ensuring customers have a positive experience with their products. Strategies to achieve this can vary from improving the content on the website, so customers can self-serve instead of calling for help, to better preparing call center agents for quick resolution of customer calls, to building products that require less support to begin with.
The data science team in the company’s post-sales organization has the mission of using data to help achieve these goals.

“The fusion of these two technologies allows us to go from an anecdotal approach to a data-supported approach that enables us to create more meaningful interventions and better patient care moving forward.”

An analytics division in a privately held healthcare company wanted to use its vast amount of patient treatment data to help drive better care and outcomes. The company monitored each patient’s progression over the entire course of treatment, storing vast amounts of data in many different formats and across many facilities. This led to a complex dataset, which the company needed to cleanse and simplify quickly in order to draw fast, actionable treatment conclusions to share with doctors.
RapidMiner’s unified data science platform was chosen for its easy-to-use drag-and-drop visual programming and its ability to integrate with third-party software like Tableau. This gave the company the robust data prep and predictive modeling functionality of RapidMiner along with the ability to operationalize results directly into the user-friendly, interactive dashboards of Tableau.

“RapidMiner enables us to react more quickly to the requests of our customers.”

Austria’s leading mobile phone service provider, Mobilkom Austria, received more than 800,000 emails every month; even after spam filtering, more than 80,000 customer requests remained. Of course, customers expect a timely reply, especially when communicating through this medium.
Using RapidMiner’s Data Science Platform, Mobilkom was able to analyze the textual content of incoming customer requests and automatically determine the topic of each request. This way, email requests are automatically and quickly forwarded to the support person in charge of the topic, and a competent answer is provided within the shortest time possible.

“Civil lawsuits cost the American Economy an estimated $233 billion a year.”

Millions of patents exist and new ones are granted every day. These documents are publicly available, but it is still difficult to track and monitor the huge number of possible violations. Is someone violating one of our patents? Are we possibly violating someone else’s? Big corporations need to know.
This multinational chip manufacturer collects patent documents filed by competitors and uses RapidMiner’s text analytics and predictive capabilities to sort and track them.

“The ability to accurately forecast future sales allowed the company to contract for the correct amount of storage space, thereby avoiding waste and cutting unnecessary cost.”

This multinational pharmaceutical company sells thousands of different drugs. In order to optimize logistical operations and storage needs, the company needed to know future sales. If the company just used sales from the previous month as a predictor for the next month, the forecast could be off by 20 percent in either direction. Better predictions mean better processes and huge logistical savings.
By using RapidMiner and looking at a variety of factors (not just the previous month’s sales), the company was able to predict its upcoming month’s sales within three percent across multiple product lines.
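To make the comparison concrete, the error of the "previous month" baseline can be measured as an absolute percentage error for each month. The sketch below shows only that baseline calculation; it does not reproduce the company's model. The function names and the sales figures are hypothetical and chosen purely for illustration.

```python
def naive_forecast(sales):
    """Forecast each month as the previous month's actual sales."""
    return sales[:-1]


def percentage_errors(actual, forecast):
    """Absolute percentage error of each forecast relative to actual sales."""
    return [abs(f - a) / a * 100 for a, f in zip(actual, forecast)]


def share_within(errors, tolerance_percent):
    """Fraction of forecasts whose error is at or below the tolerance."""
    return sum(e <= tolerance_percent for e in errors) / len(errors)


# Made-up monthly sales (units), not data from the case study.
sales = [100, 120, 96, 115, 92]

actual = sales[1:]              # months that have a previous month to forecast from
forecast = naive_forecast(sales)
errors = percentage_errors(actual, forecast)

print([round(e, 1) for e in errors])
print("mean error (%):", round(sum(errors) / len(errors), 1))
print("share within 3%:", share_within(errors, 3.0))
```

The same `percentage_errors` function could be applied to the output of any richer model, which makes it straightforward to check a claim such as "within three percent" against the baseline on the same months.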

“In today’s world, if patients experience problems with medications or medical devices, or if they have a great experience, they tell their friends, often on Facebook and Twitter.”

This giant pharmaceutical firm was looking for customer feedback. It wanted to know what people liked about its products. Did people prefer the company’s product over other products? Did these preferences develop and change over time? In addition, the company was legally required to report any adverse product reactions, so a connection to customers was doubly important.
This firm focused on collecting publicly available information with RapidMiner, primarily from the diabetic community, specialized diabetes forums, blogs and the major social networks. The information was in the form of millions of individual texts and posts per year, far more than could be reviewed by human eyes. Is this text about the company’s product? Is it about a competitor’s product? Is the post about the consumer’s desires about the product or is it from real experience? Once the appropriate texts were identified, RapidMiner’s sentiment analysis tools were used to determine whether each one was positive or negative.


“RapidMiner’s predictive analytics platform makes maintenance and repair services more effective and efficient.”

Reducing Failures, Downtime and Cost with Predictive Maintenance

In the area of aircraft maintenance, it is vital to be able to predict airplane component or equipment failures and maintenance needs in order to reduce costly downtime, avoid unplanned out-of-service times, and optimize service crew schedules. With over 1,000 airplanes to be maintained, Lufthansa had hundreds of thousands of log entries, sensor readings, error messages, and maintenance reports that needed to be evaluated in order to accurately predict and prevent failures.
Lufthansa uses the RapidMiner Data Science Platform to offer predictive analytics services to their customers. Using RapidMiner’s real-time analytics of time series data, feature extraction, machine learning for regression, classification, and frequent item set mining, on the available airplane and service data, they were able to develop accurate models for predicting when maintenance should be performed.

“We wanted to predict the actual arrival time of a flight to within a 5 minute window, at the very moment the plane took off. This would typically give the Lufthansa operations team enough time to make any necessary adjustments at the arrival destination to minimize passenger inconvenience and costs for Lufthansa.”

Increasing Operational Profitability through Better Flight Arrival Time Predictions

When a Lufthansa plane arrives late at its destination, it does more than inconvenience passengers. In fact, what may seem like a tolerably small arrival delay for a passenger, say, less than thirty minutes, can have a severe operational impact on Lufthansa. A delayed arrival may impact catering service, flow of flight crews between aircraft, gate availability, connecting flights for passengers, and more. This can add up to significant costs for Lufthansa, especially in cases where a passenger misses a connecting flight.
Lufthansa was already making arrival time predictions for each flight before the Industry Solution team’s work began. But the team has determined that its new models built with RapidMiner are significantly more accurate at the time of aircraft take-off, that critical moment when the operations team wants to be able to start making adjustments at the flight’s destination if needed.

“The seamless integration of RapidMiner’s lightning fast data science platform and QlikView’s strong visualization capabilities provides a Customer Segmentation solution to drive revenue optimization.”

Customer segmentation is one of the most important challenges organizations face, because of multiple data sources and differing business needs. Because IT has to prepare the data and modify the model manually to fit each business scenario, time and money are lost in the process.
Connect, combine and analyze both structured and unstructured data coming from multiple data sources with RapidMiner. Create an automated segmentation system based on user-specified parameters provided directly from a QlikView dashboard. Use the QlikView connection to enable users to trigger a workflow directly from QlikView and evaluate different models based on those parameters.

“Because of RapidMiner’s ease-of-use, especially the drag and drop modeling interface, within days our team was using it productively.”

Modern Marketing Concepts, Inc. (MMC) is a global leader in the business-to-business marketing services industry, offering innovative marketing solutions across multiple industries, including the building trades and healthcare. Based in Binghamton, NY, MMC has over 25 years of experience changing the way its clients market and sell their products, with turnkey, full-service marketing applications and services that fine-tune campaigns to optimize results.

“RapidMiner is extremely powerful, has the best operators, and can handle Big Data from wearables. It also allows us to rapidly prototype very sophisticated analytics, machine learning, and classification applications, saving significant time and money.”

Based in Stonington, Conn., Body Biolytics is focused on applying activity-recognition software to the sport, fitness and health industries. The company’s technology is field-proven, having been installed on over 40 U.S. Navy ships to keep a close watch on machinery health by collecting data from hundreds of onboard sensors. Using predictive analytics on this data, the software predicts machinery failures, allowing maintenance crews to take corrective action in advance of any problems.

“Using RapidMiner’s data mining functionality, we can do risk analysis, checking for errors or omissions, flagging certain substances or products, and searching for alternatives.”

Established in 2012, with the project lead headquartered in Stuttgart, Germany, SustainHub provides a systematic and efficient approach to collect compliance and sustainability data for products and manufacturing processes through the supply chain, and integrates these into the internal systems and processes of companies. This leads to better management of supply chain data and sustainability data, and improves the eco-efficiency performance of product design and production.

“One of the great things about RapidMiner is its ability to process customer feedback in multiple different languages.”

Founded in 1998 and acquired by eBay in 2002, PayPal is the faster, safer way to pay and get paid online, providing simpler ways to send and receive money around the world. With 143 million active accounts in 193 markets and 26 currencies around the world, PayPal enables global commerce, processing more than 8 million payments every day.
For Han-Sheong Lai, Director of Operational Excellence and Customer Advocacy at PayPal, and Jiri Medlen, Senior Text Analytics Specialist at PayPal DT, driving customer satisfaction and reducing customer churn are never-ending challenges. Han and Jiri knew that figuring out what drives product experience improvement without adequate knowledge of customer perspective and feedback is like “shooting in the dark,” hoping that opinion-based actions translate into tangible business improvement. By applying basic voice-of-the-customer concepts and text analytics to customer feedback in over 60 countries worldwide, Han, Jiri and their team were able to identify, classify and count customers as “top promoters” and “top detractors,” according to their verbatim feedback.