Data Science is the new Computer Science

(Reposted from LinkedIn)

A long, long time ago I studied Math and Computer Science at the University of Illinois. I spent most of my high school years as an aspiring BASIC programmer, and went to Illinois for its computer science program. My high school advisor encouraged me to add math to my degree.

I had an amazing experience at Illinois, where I witnessed the birth of the web thanks to a classmate named Marc Andreessen. I still have a vivid memory of my computer science classes, which ranged from highly practical classes in C++ (well, practical at the time; this was before Java) to theoretical classes in algorithm optimization and proofs.

But the class that stood out to me most was CS 357 – Numerical Methods I. It was by far the most difficult class I took at Illinois, and it broke me. I ended up getting a C, and was thrilled to never again have to worry about things like Fast Fourier Transforms, Monte Carlo algorithms, and Matrix theory.

But then I joined RapidMiner and began to learn about data science, and now my nemesis, the Fourier Transform, is back on my mind.

After more than 20 years, I finally understand the practical applications of all of the math I learned in college. For example, here’s a real homework assignment from the CS 357 class (this would be absolutely trivial to solve using RapidMiner):

You are given a table with the percentage of 18-29 year olds in the United States that used social media in the years from 2005 to 2012. This is actual data taken from Pew Research Center. Find the line of best fit through the data using linear least squares.
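For the curious, here is a minimal sketch in Python of what that assignment asks for. It uses the closed-form solution for a one-variable least squares fit. The function name `fit_line` is my own, and the percentages below are made-up placeholders chosen to lie exactly on a line, not the actual Pew figures, which aren't reproduced in this post.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least squares line y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sums of squared deviations in x, and of cross deviations between x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Years covered by the assignment.
years = list(range(2005, 2013))

# PLACEHOLDER values only (hypothetical, exactly linear so the fit is easy to
# check). Swap in the real table from the assignment to reproduce the homework.
percent_placeholder = [10.0 * (year - 2005) + 10.0 for year in years]

slope, intercept = fit_line(years, percent_placeholder)
print(f"best-fit line: percent = {slope:.2f} * year + {intercept:.2f}")
```

With real data the points won't fall exactly on a line, and the same function returns the slope and intercept that minimize the sum of squared vertical distances to the line.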

To me, data science is the new computer science. The practical applications are clearer than ever. For example, every morning when I wake up, I grab my iPad and open Flipboard. I highly suggest subscribing to the data science Flipboard topic, as there’s some fantastic content. This morning, I came across two articles that illustrate the real-world impact of data science.

The first was from the Boston Globe on how a data scientist is dominating the daily fantasy sports business. Saahil Sud, a former marketing analyst with a degree in math and economics, uses predictive analytics to determine the optimal fantasy roster on DraftKings. His predictive models draw on a huge range of data sources, from historical player performance to weather to the dimensions of a ballpark, and much more. Through smart math, Sud has made more than $3.5 million in 2015.

The second article, from Fortune, was titled Why Facebook Profiles are Replacing Credit Scores. Max Levchin of PayPal fame (who also happens to be an Illinois grad!) founded a company called Affirm, which analyzes a wide variety of structured and unstructured data sources to predict the best loan candidates. For example, they look at social networks like Facebook and even GitHub. Who knew that GitHub contributions are a positive predictor of creditworthiness? According to ZestFinance, firms using predictive analytics on a variety of data sources reduce loan default rates by 40%.

So thanks, high school advisor, for encouraging me to get a math degree. While I certainly don’t pretend to know anything about data science, joining RapidMiner has completed the journey I began way back at Illinois. Math is clearly the future of business, and I’m happy to be squarely in the middle of this transformation at RapidMiner.

Tom Wentworth

Chief Marketing Officer at RapidMiner