At RapidMiner, we all know the power that artificial intelligence has to positively shape the future, both in business and in the world at large. While many enterprises are relatively new to implementing AI, I’ve spent years scrutinizing prominent use cases and staying on top of the latest trends.
The hype around AI has only grown in 2022, but sustained trends are what differentiate baseless hype from cold, hard reality. In this post, I’ll cut through the hype and the buzzwords and walk through what I consider to be the top five current AI and machine learning trends, and how I predict they’ll impact data science in the years to come.
How Do Our Past Predictions Measure Up?
As I laid out in my data science manifesto, accountability is an essential part of a data scientist’s role. I’d be remiss if I didn’t revisit my previous AI predictions for 2020 and beyond to see whether they came to fruition.
- Models controlling other models: RapidMiner helped fulfill this prophecy (which may be cheating, shh) by developing our own bias detection capabilities, though the technology still has a ways to go before we can fully trust models to manage bias on their own. The spirit of the initial prediction, an increased emphasis on detecting model misfits and bias, has certainly held true! More organizations are focusing on establishing fairness in AI. Salesforce, Airbnb, and Fidelity have recently appointed Chief Ethics Officers to navigate misfits and bias in ML models and create more ethical data science.
- Democratization of auto deep learning: While some progress has been made in increasing the accessibility of deep learning to ‘citizen users,’ and deep learning use cases are more widely adopted than they were two years ago, there are many more DL methods that could be used to solve complex problems. Computer vision technology, for example, could be further leveraged for supply chain visibility, anomaly detection, and edge computing (more on CV later).
- Training without labels: Transfer learning is a type of machine learning in which we reuse a pre-trained model as part of a new ML model, so the new model learns faster, more accurately, and from less labeled data. The increased use of transfer learning, along with crowdsourcing capabilities like Appen’s crowdsourced labeling, made this prediction pretty spot-on.
- Accountability is the new accuracy: The honeymoon period for data science is certainly coming to an end, and while an increased emphasis on ModelOps has improved model monitoring and accountability, there’s still a major culture shift that needs to happen. The fact that most models still go un-deployed indicates to me that we are still lacking true accountability in our domain.
- Ensemble 2.0: Deep features and explainable AI: While deep features didn’t really materialize in the way I expected, explainable AI, particularly understandable models, has become a huge topic. LIME and SHAP, two popular explainers, are both capable of abstracting away the hard part of understanding deep learning to make ML models more easily interpretable for citizen data scientists and business analysts.
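To make the bias detection idea from the first bullet a little more concrete, here is a minimal sketch of one common fairness check, the demographic parity gap: the difference between the highest and lowest rates at which a model hands out positive predictions across groups. This is only an illustration, not RapidMiner’s own bias detection; the function names, the group labels, and the data are all hypothetical.

```python
from collections import defaultdict


def selection_rates(predictions, groups):
    """Return the share of positive (1) predictions for each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(predictions, groups):
        totals[group] += 1
        positives[group] += pred
    return {group: positives[group] / totals[group] for group in totals}


def demographic_parity_gap(predictions, groups):
    """Difference between the highest and lowest group selection rates."""
    rates = selection_rates(predictions, groups)
    return max(rates.values()) - min(rates.values())


# Hypothetical model outputs (1 = approved) and a hypothetical group label per row.
predictions = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

rates = selection_rates(predictions, groups)
gap = demographic_parity_gap(predictions, groups)
print(rates)  # {'a': 0.75, 'b': 0.25}
print(gap)    # 0.5
```

A gap near zero means the groups are selected at similar rates. A large gap, like the one here, is a flag worth investigating rather than proof of unfairness, which is one reason a human still needs to stay in the loop.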
Not every prediction we made was a home run, but overall, we did see great strides being made toward more ethical, more accountable practices in data science. Now, let’s turn our attention forward to my AI predictions for the years to come.
1. The Continued Evolution of Automated Data Science
AutoML tools aimed at making model creation more accessible to non-experts have become ubiquitous, and the rise of AI-as-a-service suggests that advancements in automation and augmentation elsewhere in the AI development lifecycle will soon be commoditized as well.
Though AutoML can save time and energy while making predictive analytics more accessible to non-data scientists, its current capabilities cannot replace a real data scientist. Data scientists do so much more than just build models! They manage the entire AI development lifecycle.
That being said, I predict that 2022 will see further advancements in automation of commonly forgotten data science tasks and ‘soft skills.’ Imagine a program being able to perform use case analysis, and tell you what problems your data could best solve, in seconds. Other areas, such as bias detection, risk assessment, and target recommendation, are also rich with opportunities for innovation.
2. Bifurcation of the DSML Market
Leading experts and analysts like Gartner have reported a split of the DSML market into two segments—expert or “Engineering” and “Multipersona.” DSML engineering platforms are code-centric and aim to engineer ML-powered systems for data science. Multipersona platforms, on the other hand, are all about democratization and giving nontechnical audiences access to, and power over, data science capabilities.
One major machine learning trend I see taking place in the coming years is that AWS will establish a monopoly over the code-heavy engineering market segment – they simply have too many advantages, too much momentum, and too much value to offer developers. Meanwhile, multiple platforms with different strengths will meet the needs of multipersona teams.
Platforms that prioritize upskilling their users are essential for reaching wider audiences, enabling collaboration between personas, and making it easier to work on line-of-business (LOB) DSML use cases.
3. Acceptance of Low-Code
The business benefits of low-code approaches are vast – particularly in data science. A diverse group of citizen developers can build their own models without having to get fully in the weeds of complex code. This is what the concept of ‘democratization’ is all about.
However, the benefits of low-code solutions don’t only apply to beginners. Going low-code allows experienced developers to focus on highly custom coding for complex business processes while freeing them up from building low-level models and applications. See this clip to learn how data scientists at Domino’s are scaling R-based models with RapidMiner’s low-code interface to make forecasting supply chain demand easier.
This may sound crazy (I’ve been called worse!), but I anticipate coders will become the drivers of low-code solution adoption. Using low-code solutions means that data science projects are driven more by pull (for example, a user who wants a project) than by push (a stakeholder who wants to have something done). Adopting low-code means that developers can build models faster and delegate more tasks, giving coders the opportunity to be significantly more productive.
4. The Rise of Environmental AI
Sustainability initiatives are on the rise across industries, most notably in traditionally non-eco-friendly sectors such as energy, oil, and gas. With more and more organizations pledging to “go green,” the pressure is on for companies to implement technologies that allow them to decrease their environmental footprint while gaining revenue.
AI is already being used to tackle environmental challenges: reducing energy waste, preventing future deforestation, combating marine pollution, and creating safe water treatment policies. While I expect AI to continue to be used for sustainability use cases, I also expect a new focus on reducing the carbon footprint of AI itself.
ML models that learn from large quantities of data might be more accurate, but they also have a much larger environmental impact. While the increased use of transfer learning and the transition to cloud computing both help reduce that footprint, there’s also an opportunity to optimize ML algorithms not only for faster runtimes or lower memory usage, but also for lower energy consumption. Beyond this, alternative methods of cooling data centers, reduced use of physical hardware, and cloud providers’ commitments to go carbon neutral are all key to creating truly greener AI.
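For a rough sense of what optimizing for energy means in practice, here is a back-of-the-envelope sketch that estimates the energy and carbon cost of a training run from average hardware power draw, runtime, data center overhead (PUE, or power usage effectiveness), and grid carbon intensity. Every number below is a made-up assumption for illustration, not a measurement, and the function name is hypothetical.

```python
def training_footprint(avg_power_watts, hours, pue, grid_kg_co2_per_kwh):
    """Estimate the energy (kWh) and emissions (kg CO2) of one training run.

    energy    = hardware power draw * runtime * data center overhead (PUE)
    emissions = energy * carbon intensity of the electricity grid
    """
    energy_kwh = avg_power_watts / 1000 * hours * pue
    emissions_kg = energy_kwh * grid_kg_co2_per_kwh
    return energy_kwh, emissions_kg


# Hypothetical scenario 1: training a large model from scratch.
scratch_kwh, scratch_kg = training_footprint(
    avg_power_watts=2000, hours=100, pue=1.5, grid_kg_co2_per_kwh=0.5
)

# Hypothetical scenario 2: fine-tuning a pre-trained model (transfer learning),
# assumed here to need one tenth of the runtime on the same hardware.
tuned_kwh, tuned_kg = training_footprint(
    avg_power_watts=2000, hours=10, pue=1.5, grid_kg_co2_per_kwh=0.5
)

print(f"from scratch: {scratch_kwh:.0f} kWh, {scratch_kg:.0f} kg CO2")
print(f"fine-tuned:   {tuned_kwh:.0f} kWh, {tuned_kg:.0f} kg CO2")
```

The formula shows where the levers are: shorter runtimes (better algorithms, transfer learning), lower overhead (more efficient cooling), and cleaner electricity (carbon-neutral providers) each multiply straight through to the final figure.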
5. Computer Vision Becoming Increasingly Pragmatic
I mentioned in my review of 2020 predictions that computer vision could be applied to many more use cases, and I see that happening in 2022. In the past, running computer vision in the cloud severely limited its real-world applications. Slow response times, high operating costs, and lack of privacy are all compounded by hardware requirements and scaling complexity.
Edge computing has made great strides toward making computer vision more accessible. Already, AWS has responded to developers’ desires to deploy CV models at the edge with the roll-out of AWS Panorama. Similar innovations in, and simplifications of, computer vision democratize complex AI projects and make them more attainable.
I expect other providers to respond with their own solutions, making computer vision more accessible for projects that require real-time processing. Manufacturers, for example, can implement CV to detect defects in materials, monitor building entrances to increase security, and alert staff in real time.
Are you curious about how to make these AI trends work for you and your organization? Check out a recent study we did with Forrester on demystifying data science use cases: Accelerate Your Data-Driven Transformation.