When we look at the development of online communities over the last two decades, the software industry has been to the fore, especially the open source movement, where an engaged and supportive community is essential to a product’s success. In many ways the success of MySQL, Hadoop, Linux and, of course, RapidMiner is down to their vibrant communities.

Of course, a central thread of online communities has been their non-commercial nature. Put simply, they are about helping people use products rather than selling products. As B2C communities now spring up around telcos, fashion, makeup, gaming and a host of other areas, this thread appears to be at breaking point. At a recent training course on community strategy that I attended, virtually all the community managers agreed that corporations have woken up to the power of communities.

At RapidMiner, our aspiration over the coming years is to continue our role of enabling healthy conversation around data science, but also to augment that with an open platform for facilitating successful data science projects and people.

The reason we need to do this? Because you keep asking us. If we look at the impediments to great data science projects, many of them can be overcome with better data access, data preparation, cleansing, model-building, testing and so on. This we know, and it has informed our development of Studio and Server. But what we cannot expose easily is the “art” of data science, as Hofmann and Klinkenberg* describe it. These are the soft skills that only come with experience: experience of data science techniques, difficult data sets, best practices, and pitfalls to avoid. I recently heard this called Tricks, Tips, Traps and Techniques (which is much easier to type than to say!).

So how can the RapidMiner Community help here? Well, we are going to make a start on the new Community site by providing an area where you can describe your projects and invite feedback. Beyond this, we will help you expose your skills and experience to the community, and help you find people with the skills and experience you might need. We will do this through a series of ‘badges’ that will appear in your profile, together with any additional information you might want to volunteer. If you have passed our certifications, others will be able to see this. They will also be able to see whether you have contributed a knowledge-base article on a particular subject, showing an area of expertise. We will also have massively enhanced systems for searching and tagging posts, and will introduce the concept of an Accepted Answer, guiding members more quickly to the right knowledge. Lastly, we will provide “Ideas” areas, where you can tell us and others what you want from RapidMiner and from the Community itself.

The world of data science is moving incredibly quickly, so Community 2.0 is about sharing our collective knowledge, aspirations and challenges in order to help predict what the data science of the future might be like. A little like Wisdom of Crowds, but extended beyond the realms of Studio.

*Hofmann, M. and Klinkenberg, R.: Data Mining Use Cases and Business Analytics Applications, Chapman & Hall/CRC