High Volume, High Frequency, Low Latency Data Processing

With the Streaming Extension for RapidMiner

Sebastian Land, Old World Computing

Many deployment scenarios require applying prediction models to a high volume of events arriving at high frequency. If low latency is also key to automating a process, conventional deployment approaches fail to satisfy all requirements. A web-service-based deployment cannot efficiently handle the volume, while batch-oriented Big Data approaches cannot achieve low latency. Stream processing combines the scalability of a Big Data approach with the low latency of a web service call.

In this talk, we will outline use cases ranging from pure real-time reporting to applying predictive analytics. For each use case, we will show how it can be implemented using the Streaming Extension in combination with the established RapidMiner platform. Finally, we will sketch a typical infrastructure for the different use cases.