Extended Operations for Nominal Values

One of the next versions of RapidMiner (5.0.011 or the upcoming version 5.1) will provide a nice extension of the expression parser, which is used, for example, by the operator “Generate Attributes”. The new operations work on nominal values and can be used directly within the expressions that define the new attributes.

The supported functions include

  • Number to String [str(x)],
  • String to Number [parse(text)],
  • Substring [cut(text, start, length)],
  • Concatenation [concat(text1, text2, text3…)],
  • Replace [replace(text, what, by)],
  • Replace All [replaceAll(text, what, by)],
  • To lower case [lower(text)],
  • To upper case [upper(text)],
  • First position of string in text [index(text, string)],
  • Length [length(text)],
  • Character at position pos in text [char(text, pos)],
  • Compare [compare(text1, text2)],
  • Contains string in text [contains(text, string)],
  • Equals [equals(text1, text2)],
  • Starts with string [starts(text, string)],
  • Ends with string [ends(text, string)],
  • Matches with regular expression exp [matches(text, exp)],
  • Suffix of length [suffix(text, length)],
  • Prefix of length [prefix(text, length)],
  • Trim (remove leading and trailing whitespace) [trim(text)].
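To give a feeling for how these functions combine, here is a rough Python sketch that mimics a handful of them. It is not RapidMiner code and not the actual implementation. The helper names simply mirror the list above, and the details are assumptions: positions are taken to be zero-based, index is taken to return -1 when nothing is found, and matches is taken to test the whole text, as Java's String methods do. The function derive_name_parts is a made-up example.

```python
import re

# Hypothetical Python stand-ins for some of the listed functions.
# Assumptions (not stated in the post): zero-based positions, index()
# returns -1 when the string is absent, matches() tests the whole text.

def cut(text, start, length):
    """Substring of the given length, beginning at position start."""
    return text[start:start + length]

def concat(*texts):
    """Join any number of strings."""
    return "".join(texts)

def index(text, string):
    """First position of string in text, or -1 if it does not occur."""
    return text.find(string)

def prefix(text, length):
    """First length characters of text."""
    return text[:length]

def suffix(text, length):
    """Last length characters of text."""
    return text[-length:] if length > 0 else ""

def matches(text, exp):
    """True if the whole text matches the regular expression exp."""
    return re.fullmatch(exp, text) is not None

def replace_all(text, what, by):
    """Replace every match of the regular expression what with by."""
    return re.sub(what, by, text)

def derive_name_parts(raw):
    """Made-up example: split 'Last, First' into parts and build a key."""
    clean = raw.strip()                      # trim(text)
    comma = index(clean, ",")
    last = prefix(clean, comma)
    first = cut(clean, comma + 2, len(clean) - comma - 2)
    key = concat(last.upper(), "_", first.lower())  # upper(), lower()
    return last, first, key

print(derive_name_parts("  Smith, John "))
```

In “Generate Attributes” the same idea would be written as one expression per new attribute, nesting the functions from the list instead of calling Python helpers.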

It is amazing how many new data transformations you can perform with this simple set of text operations. Until now I often had to fall back on the operator “Execute Script” for this type of operation, which is no longer necessary.

I have also just uploaded a process to myExperiment, which can be downloaded directly with our Community Extension (but of course you will need the RapidMiner update first 😉). The process is named “Extended Operations for Nominal Values”, just like this blog entry.

Ingo Mierswa

Ingo Mierswa is the founder and president of RapidMiner and has been an industry-veteran data scientist since he started developing RapidMiner at the Artificial Intelligence Division of the TU Dortmund University in Germany. Mierswa, the scientist, has authored numerous award-winning publications about predictive analytics and big data. Mierswa, the entrepreneur, is the founder of RapidMiner. Under his leadership RapidMiner grew by up to 300% per year over the first seven years. In 2012, he spearheaded the go-international strategy with the opening of offices in the US as well as the UK and Hungary. After two rounds of fundraising, the acquisition of Radoop, and supporting the positioning of RapidMiner with leading analyst firms like Gartner and Forrester, Ingo takes a lot of pride in bringing the world’s best team to RapidMiner.