
18 August 2022


What I Wasn’t Taught About Data Science

August 5th marked the end of my first month as RapidMiner’s Product Marketing Co-Op. So far, it’s been an incredible opportunity to meet and learn from so many new faces, experience the inner workings of a business, and most notably gain a whole new perspective on data science.  

I’m currently an economics student at Northeastern University, and in my second year I decided to pursue a minor degree in data science. I wanted to learn how I could efficiently leverage data to my advantage and generate insights beyond what was obvious from my statistics and economics classes.

Interested in learning more about data science, I chose to join RapidMiner and gain exposure to the business applications of machine learning and artificial intelligence. 

My Academic Experience with Data Science 

At Northeastern, I’ve followed the traditional route for data science minors and taken Python coding classes up to the intermediate level. I’ve always enjoyed math and problem solving, which is what my friends and I imagined coding would be like, and in many ways it is.

I enjoy the thinking process behind code and get satisfaction from understanding its creative yet systematic approaches to complicated problems, but unfortunately, I only ever got a glimpse of that side of it.  

Instead, introductory coding classes focused most of our energy on learning, practicing, and memorizing inefficient, basic-level functions with limited practical use. My entire first course taught me how to manually load data files, read their contents, clean the data, and finally visualize it.

After a few months of learning to code, all we had to show for it was a slightly boxier and less appealing version of an Excel chart I could’ve made in 5 minutes.  
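For a sense of what that load, clean, and visualize routine involves, here is a minimal sketch using only Python's standard library. It is not taken from my coursework: the sample data, the column names, and the helper functions (`load_rows`, `clean_rows`, `text_bar_chart`) are all made up for illustration, and a text chart stands in for a real plot.

```python
import csv
import io

# Hypothetical sample data standing in for a CSV file on disk.
RAW_CSV = """month,sales
Jan,120
Feb,
Mar,95
Apr,not available
May,140
"""


def load_rows(text):
    """Read CSV text into a list of dictionaries, one per data row."""
    return list(csv.DictReader(io.StringIO(text)))


def clean_rows(rows):
    """Keep only rows whose 'sales' value is a non-negative whole number."""
    cleaned = []
    for row in rows:
        value = row["sales"].strip()
        if value.isdigit():
            cleaned.append((row["month"], int(value)))
    return cleaned


def text_bar_chart(pairs, unit=10):
    """Build one line per row, drawing one '#' per `unit` of sales."""
    return [f"{label:<4}{'#' * (value // unit)} {value}" for label, value in pairs]


rows = load_rows(RAW_CSV)      # load and read
data = clean_rows(rows)        # clean
chart = text_bar_chart(data)   # visualize
print("\n".join(chart))
```

Even this toy version needs three helper functions before anything appears on screen, which is roughly why the result felt underwhelming next to a spreadsheet chart.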

Granted, this was only the first step into the field of data science through coding, but the steep learning curve and lack of practical applications meant many of my friends lost interest and ran out of patience, believing data science wouldn’t benefit them without years of practice.

In many ways they were right, and the situation I’ve experienced at Northeastern perfectly represents the biggest obstacle facing data science right now. In its current form, coding poses a considerable barrier to entry, intimidating newcomers and limiting data science’s accessibility to only those willing to make a serious commitment.

A New Perspective: Data Science Outside of the Classroom 

During my first week at RapidMiner, I was instructed to take a course on their educational platform called RapidMiner Academy.

The course, Applications and Use Cases, was my first insight into RapidMiner’s vastly different ideology around data science.

Instead of teaching data science as a language, as my university classes did, this course described the methodology behind approaching a business problem with data science.

I was genuinely surprised to see how different RapidMiner’s foundation-level data science course was from my university classes, and their approach left me more interested in data science and machine learning than I was before.

I was learning why each step of the process matters and how to carry it out: deciding what you want to learn from your data, preparing the data and building models from it, illustrating insights effectively, and validating your models before putting them to work in real-life applications.

This holistic approach was new to me, and I immediately felt like it had been missing from all my university classes.  
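To make those steps concrete, here is a minimal Python sketch of the same sequence: pick a question, prepare the data, build a model, and validate it on data the model has not seen. This is my own illustration, not material from the Academy course. The data is synthetic, the helper names (`fit_line`, `mean_abs_error`) are hypothetical, and a one-feature least-squares line stands in for whatever model a real project would use.

```python
import random

# Hypothetical data: hours studied vs. exam score, with a little random noise.
random.seed(0)
hours = [h / 2 for h in range(2, 42)]  # 40 values from 1.0 to 20.5
scores = [40 + 2.5 * h + random.uniform(-3, 3) for h in hours]

# 1. Decide the question: can hours studied predict the exam score?

# 2. Prepare the data: shuffle, then hold out 25% for validation.
pairs = list(zip(hours, scores))
random.shuffle(pairs)
split = int(len(pairs) * 0.75)
train, holdout = pairs[:split], pairs[split:]


# 3. Build a model: ordinary least squares with a single feature.
def fit_line(points):
    """Return (slope, intercept) of the least-squares line through points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line(train)


# 4. Validate: mean absolute error on rows the model never saw.
def mean_abs_error(points, slope, intercept):
    """Average absolute gap between actual and predicted scores."""
    return sum(abs(y - (slope * x + intercept)) for x, y in points) / len(points)


holdout_error = mean_abs_error(holdout, slope, intercept)
print(f"slope={slope:.2f} intercept={intercept:.2f} holdout MAE={holdout_error:.2f}")
```

The point of the holdout set is that the error is measured on rows the model was not fitted to, so it gives a more honest preview of how the model might behave on new data.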

Soon after completing my first course on Academy, my tasks began involving RapidMiner’s platform (which is what I was most excited for!). I’d seen demonstrations and tutorials for the product before, but I’d never had an opportunity to experiment with it myself.

Within an hour, I’d gotten a basic lay of the land, and I could generate outputs using the AutoML feature, which made me realize how accessible these tools were. Coding would eventually enable a more in-depth and tailored approach to data science, but AutoML’s efficiency was unmatched and made machine learning accessible to all experience levels.  

In the weeks since, I’ve developed a better understanding of the platform and its capabilities, building out machine learning workflows of my own and recreating, and even improving on, past projects I had done in Python.

Final Thoughts 

Learning the ropes at RapidMiner and gaining exposure to their product, training courses, and culture has shown me that data science is more powerful and accessible than most people realize.

I no longer believe that anyone needs to spend months learning to code to successfully leverage data to their advantage—they simply need an intuitive mindset and to understand the methodology behind data science. 

The future of data science is no longer just Python coders; it is the adoption of machine learning and artificial intelligence in every department of a business. If coding feels like too much of a commitment or obstacle for you, as it did for some of my friends at university, there are many other ways for you to accelerate and improve your work with data.

As an academic learning Python myself, I can officially say I’m excited by no-code alternatives like RapidMiner. I’d encourage you to see for yourself how they cater to academics.
