MLOps for AI at Scale

By adopting an MLOps approach, enterprises can optimize pipelines for AI at scale across key functions.


It’s no surprise that machine learning and artificial intelligence offer great promise. By integrating these capabilities into applications and solutions that users rely on every day, organizations can drive process efficiencies, improve data-driven decision-making, increase revenue, and reduce costs. But AI-enabled transformation often stalls due to talent shortages, a lack of quality data, difficulty integrating emerging technologies, and evolving AI-related risks. Recent studies show that almost 50 percent of pilot projects never make it into production, and, on average, the journey to production takes seven months.1 That’s where MLOps, or machine learning operations, can help.

To address this immediate pain and achieve quick wins, many enterprises turn to commercially available, generic “pre-packaged AI-enabled applications” that boast a dazzling front-end interface and the promise of value. Unfortunately, these short-term “fixes” lead to a new challenge: a portfolio of clunky solutions that are hard to integrate, can’t scale to meet enterprise needs, and can lead to more siloed data and long-term vendor lock-in.

By shifting from a vertical (and likely proprietary) approach to an enterprise architecture strategy built on horizontal, repeatable pipelines that support every aspect of an ML model’s lifecycle, you can gain speed and scale and generate more value from your AI investments. In essence, it’s applying the same rigor and principles of DevOps to ML, otherwise known as MLOps. By adopting an MLOps approach, enterprises can optimize pipelines to enable AI at scale across key functions: model deployment, integration, running, and monitoring.

Deployment & integration

With an MLOps approach, you can significantly reduce the time it takes to move a model into production. With powerful pipelines that automatically deploy models from nearly any training tool or framework into open-source containers, you turn your models into immutable API endpoints that any modern software developer can use or integrate into any application.

Today’s Challenges

  • Models are written in different languages and there’s no standard approach to deployment
  • Models sit idle in the lab, and there’s no efficient way to move them to production
  • There’s no easy hand-off between the teams building models and the teams integrating them into production applications

A Better Way

  • Containerize all your models – your software developers and DevOps teams are already familiar with working with containers
  • Configure a repeatable pipeline to field all your models for deployment
  • Create a central space for your data scientists building models and the engineers integrating them
  • Shift to an API-based approach that turns models into API endpoints that can be integrated and run anywhere (a minimal sketch follows this list)
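As a concrete illustration of the container-plus-API pattern above, here is a minimal sketch of serving a model as an HTTP prediction endpoint. It assumes a scikit-learn model serialized to model.joblib and uses FastAPI; the file names, field names, and request schema are illustrative assumptions, not a prescribed implementation.

```python
# serve.py: minimal sketch of exposing a trained model as an API endpoint.
# Assumes a scikit-learn model has been serialized to model.joblib;
# names and paths here are illustrative.
import joblib
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
model = joblib.load("model.joblib")  # load once when the container starts

class Features(BaseModel):
    # request schema: a flat list of numeric features (hypothetical)
    values: list[float]

@app.post("/predict")
def predict(features: Features):
    # any application that can call an HTTP endpoint can now use the model
    prediction = model.predict([features.values])
    return {"prediction": prediction.tolist()}
```

Baked into a container image with a standard Python base image and an entrypoint such as uvicorn serve:app, the same artifact can be deployed by the pipelines your DevOps teams already use for any other service.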

“The mark of a mature, digital native organization is the presence of an integrated foundation of software, data, and AI with consistent architecture and integrated APIs.”2

Running models

One of the most common questions is where AI/ML models should run: centralized in your cloud, on-premises, or as close to the data source as possible. The answer is unique to each use case and organization; however, the reality is that hardware and infrastructure requirements can and will change, so now is the time to optimize for flexibility. Shifting to an MLOps mindset where models are API endpoints that can run anywhere helps you keep hardware and infrastructure options open, while staying mindful of considerations like cost and processing latency requirements.

Today’s Challenges

  • Diversity of AI use cases and risks related to running models in the cloud, on-premises, or at edge locations
  • Managing unexpectedly high operational (read: cloud) costs for running AI applications

A Better Way

  • Separate your ML/AI pipeline from existing hardware and infrastructure
  • Choose tools that offer model deployment flexibility for any type of hardware
  • Calculate the tradeoff between cloud costs and your needs; in some cases it makes more sense to consider on-premises deployment3 (a back-of-the-envelope sketch follows this list)
  • Use MLOps tools that enable smart infrastructure autoscaling, scaling capacity up and down to better manage costs
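To make the cloud-versus-on-premises tradeoff concrete, the sketch below compares an estimated monthly cloud inference bill against an amortized on-premises alternative. Every figure and parameter name is a placeholder assumption for illustration; substitute your own pricing, utilization, and volume data.

```python
# cost_tradeoff.py: back-of-the-envelope comparison of cloud vs. on-premises
# inference cost. All numbers below are placeholder assumptions, not benchmarks.

def monthly_cloud_cost(requests_per_month: float,
                       gpu_hours_per_million_requests: float,
                       gpu_hourly_rate: float) -> float:
    """Estimate monthly cloud cost from request volume and GPU-hour pricing."""
    gpu_hours = requests_per_month / 1_000_000 * gpu_hours_per_million_requests
    return gpu_hours * gpu_hourly_rate

def monthly_onprem_cost(hardware_cost: float,
                        amortization_months: int,
                        monthly_ops_cost: float) -> float:
    """Amortize the hardware purchase and add ongoing operations cost."""
    return hardware_cost / amortization_months + monthly_ops_cost

if __name__ == "__main__":
    cloud = monthly_cloud_cost(requests_per_month=50_000_000,
                               gpu_hours_per_million_requests=20,
                               gpu_hourly_rate=2.50)
    onprem = monthly_onprem_cost(hardware_cost=120_000,
                                 amortization_months=36,
                                 monthly_ops_cost=1_500)
    print(f"cloud: ${cloud:,.0f}/month, on-premises: ${onprem:,.0f}/month")
    print("on-premises is cheaper" if onprem < cloud else "cloud is cheaper")
```

The point of the exercise is not the numbers but the habit: because an MLOps pipeline treats models as portable API endpoints, you can rerun this comparison as volumes grow and move workloads without re-architecting.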

Monitoring models

Even when teams successfully productionize ML/AI models, many fail to consider that models must be monitored once they’re integrated and running in production applications. An MLOps solution can automate production monitoring of models, detecting and alerting when a model’s performance is drifting and informing model retraining. By centralizing model monitoring and alerting, teams can reduce the need to manually “babysit” models, all while increasing transparency and enabling better accountability and governance.

Today’s Challenges

  • Models are inconsistently deployed, making performance monitoring difficult
  • Monitoring models is a manual process, and “babysitting” is time consuming
  • Teams face bad outcomes due to drifting models with limited transparency into why models behave in a certain way
  • Stakeholder groups all have different monitoring needs and risk tolerances

A Better Way

  • Automate monitoring, logging, and alerting for model-specific metrics
  • Monitor drift (both data drift and concept drift) to identify when to trigger model retraining loops (a minimal drift-check sketch follows this list)
  • Ensure results are explainable; document predictions and enable human-in-the-loop feedback
  • Track job history to enable auditability, transparency, and accountability
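Drift monitoring does not have to start with a heavyweight platform. The sketch below is a minimal, illustrative example (not any particular MLOps product’s API) that flags features whose production distribution has shifted from the training baseline, using a two-sample Kolmogorov-Smirnov test from SciPy; the threshold and the simulated data are assumptions for demonstration only.

```python
# drift_check.py: minimal sketch of a data-drift check that could feed
# alerting or a retraining trigger. Threshold and data are illustrative.
import numpy as np
from scipy.stats import ks_2samp

P_VALUE_THRESHOLD = 0.01  # illustrative; tune to your own risk tolerance

def drifted_features(reference: dict[str, np.ndarray],
                     production: dict[str, np.ndarray]) -> list[str]:
    """Return names of features whose production distribution differs
    significantly from the training (reference) distribution."""
    flagged = []
    for name, ref_values in reference.items():
        result = ks_2samp(ref_values, production[name])
        if result.pvalue < P_VALUE_THRESHOLD:
            flagged.append(name)
    return flagged

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    reference = {"age": rng.normal(40, 10, 5_000)}
    production = {"age": rng.normal(48, 10, 5_000)}  # simulated shift
    drift = drifted_features(reference, production)
    if drift:
        print(f"drift detected in {drift}; consider triggering retraining")
```

In practice, the same check would run on a schedule against logged production inputs, with results routed to the centralized monitoring and alerting described above.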

MLOps for AI at Scale

Ultimately, many of the challenges teams face when embedding AI across the enterprise stem from a shortcut-first approach that prioritizes “pre-packaged applications” that can’t adapt to an organization’s shifting needs. While this approach might seem efficient in the near term, it only compounds challenges across the pipeline in the long term. By shifting to an MLOps mindset characterized by a horizontal approach that enables speed and scale, teams can move past long development cycles, vendor lock-in, and failed initiatives.

Fortunately, there are lessons to be learned from those who dared to forge ahead, and there are design patterns for building out your AI tech stack. To get started, check out the next installment in this series, MLOps Architecture: Building your MLOps Pipeline.

1 “MLOps: Making Sense of a Hot Mess,” VentureBeat, 1 Aug 2022.
2 “Democratizing Transformation,” Harvard Business Review, May 2022.
3 “Why AI and Machine Learning Are Drifting Away from the Cloud,” Protocol, 1 Aug 2022.