In a previous installment of the ModelOps blog series, we discussed deploying your models to Modzy, and in the last blog, we discussed moving them into production. In this final installment of the series, we’ll discuss how to capitalize on your newly deployed models and how to keep them accurate by leveraging Modzy’s monitoring, drift detection, and explainability features.

Readiness and Scalability

Once your model is deployed to Modzy, it is ready for scalable production inference. Modzy handles the resource scaling that models need as workloads vary. It does this with “Processing Engines”, which are the infrastructure units used to run models. You specify how infrastructure resources are allocated as jobs are submitted to different models, which lets you control both cost and latency. If a model must always be ready to accept and quickly process inference jobs, set its minimum number of processing engines to one or greater so that it is always spun up and ready. If you don’t want to incur the infrastructure costs of an always-on model, keep the processing engine minimum at 0. In that case there will be some latency when a job comes in while an engine spins up, but you decide which variable matters more for each model and use case. Finally, you can raise the maximum number of processing engines to accommodate large workloads. Once you’ve specified your preferences, Modzy provisions and scales processing engines as workloads vary, within your predetermined parameters.
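
To make the minimum/maximum trade-off concrete, here is a minimal sketch of how a scaler might pick an engine count within configured bounds. It is an illustration only and not Modzy’s implementation: the names `EngineLimits`, `target_engines`, and `jobs_per_engine` are hypothetical, and the rule of one engine per fixed batch of queued jobs is an assumption.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class EngineLimits:
    """Hypothetical per-model scaling bounds (illustrative names, not Modzy's)."""
    minimum: int  # engines kept running even when no jobs are queued
    maximum: int  # hard ceiling that caps infrastructure cost

    def __post_init__(self):
        if self.minimum < 0 or self.maximum < max(self.minimum, 1):
            raise ValueError("require 0 <= minimum <= maximum and maximum >= 1")


def target_engines(queued_jobs: int, jobs_per_engine: int, limits: EngineLimits) -> int:
    """Return the engine count for the current queue, clamped to the limits."""
    if jobs_per_engine < 1:
        raise ValueError("jobs_per_engine must be at least 1")
    # Assumed sizing rule: one engine per `jobs_per_engine` queued jobs.
    needed = math.ceil(queued_jobs / jobs_per_engine)
    return max(limits.minimum, min(needed, limits.maximum))


always_on = EngineLimits(minimum=1, maximum=4)      # low latency, steady cost
scale_to_zero = EngineLimits(minimum=0, maximum=4)  # no idle cost, cold starts

print(target_engines(0, 10, always_on))        # idle queue, one engine stays warm
print(target_engines(0, 10, scale_to_zero))    # idle queue, nothing running
print(target_engines(1000, 10, scale_to_zero)) # large backlog, capped at maximum
```

With a minimum of 1, an idle model still holds one engine. With a minimum of 0, the idle cost disappears, and the first job has to wait for an engine to start.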

Production Monitoring

Now that you’ve configured resource scaling for your models, you’re ready to start using them in production applications. You can submit inference jobs to Modzy models from any environment that can send HTTP requests. The API routes required to submit a job, check its status, and finally retrieve the inference results are well-documented here. However, if you’re writing your applications in Python, Java, JavaScript, or Golang, we provide SDKs that make this process even easier. Using these SDKs, you can add AI power to your applications with just a few lines of code.
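
The submit, check status, and retrieve flow described above can be sketched as a simple polling loop. This is a generic illustration, not the Modzy API or SDK: the `submit`, `status`, and `results` methods, the `"RUNNING"`, `"COMPLETED"`, and `"FAILED"` status strings, and the `InMemoryClient` stand-in are all assumptions made so that the example runs without a network. In a real application these calls would wrap the documented API routes or an SDK.

```python
import time


class InMemoryClient:
    """Hypothetical stand-in for a real job client; it reports RUNNING a few times."""

    def __init__(self, polls_until_done=2):
        self._polls_until_done = polls_until_done
        self._remaining = {}
        self._jobs = {}

    def submit(self, model_id, payload):
        job_id = f"job-{len(self._jobs) + 1}"
        self._jobs[job_id] = (model_id, payload)
        self._remaining[job_id] = self._polls_until_done
        return job_id

    def status(self, job_id):
        if self._remaining[job_id] > 0:
            self._remaining[job_id] -= 1
            return "RUNNING"
        return "COMPLETED"

    def results(self, job_id):
        model_id, payload = self._jobs[job_id]
        return {"job_id": job_id, "model_id": model_id, "echo": payload}


def run_inference(client, model_id, payload, poll_seconds=1.0,
                  timeout_seconds=60.0, sleep=time.sleep):
    """Submit a job, poll until it finishes, and return its results."""
    job_id = client.submit(model_id, payload)
    deadline = time.monotonic() + timeout_seconds
    while True:
        state = client.status(job_id)
        if state == "COMPLETED":
            return client.results(job_id)
        if state == "FAILED":
            raise RuntimeError(f"job {job_id} failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish in {timeout_seconds}s")
        sleep(poll_seconds)  # wait before asking again


result = run_inference(InMemoryClient(), "sentiment-model", {"text": "hello"},
                       poll_seconds=0.01)
print(result)
```

Passing `sleep` as a parameter keeps the loop easy to test, and the timeout guards against a job that never leaves the running state.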

As the previous blog mentioned, once a model is deployed to production and inferences are being run, the model must be monitored and maintained over time – this is one of the most important parts of the ModelOps pipeline. Models in production should be monitored in several ways, and Modzy provides functionality for each of them.

Drift Detection

One way that Modzy allows you to monitor your models is through its drift detection features. Drift is a phenomenon in which the data fed to a model for inference differs significantly from the data originally used to train the model. When drift is detected, it can indicate that the model needs to be retrained. Modzy provides functionality to detect drift and alert you accordingly. To learn more, check out our blog post on Data Drift Detection.
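
As a rough illustration of what comparing inference data against training data can look like, the sketch below computes the Population Stability Index (PSI) for a single numeric feature. PSI is one common general-purpose drift measure, swapped in here for illustration; this post does not say which method Modzy uses. The function name, the bin count, and the 0.25 alert threshold (a widely quoted rule of thumb) are assumptions.

```python
import math


def population_stability_index(reference, production, bins=10, eps=1e-6):
    """PSI between a reference (training) sample and a production sample.

    Bins are equal-width over the reference range; production values outside
    that range are counted in the first or last bin.
    """
    if not reference or not production:
        raise ValueError("both samples must be non-empty")
    lo, hi = min(reference), max(reference)
    if hi == lo:
        raise ValueError("reference sample has no spread")
    width = (hi - lo) / bins

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor at eps so empty bins do not cause log(0) or division by zero.
        return [max(c / len(values), eps) for c in counts]

    ref_p = proportions(reference)
    prod_p = proportions(production)
    return sum((p - r) * math.log(p / r) for r, p in zip(ref_p, prod_p))


training = [i / 100 for i in range(100)]          # roughly uniform on [0, 1)
shifted = [0.5 + i / 200 for i in range(100)]     # concentrated in the upper half

ALERT_THRESHOLD = 0.25  # assumed rule-of-thumb cutoff, tune per feature
psi = population_stability_index(training, shifted)
if psi > ALERT_THRESHOLD:
    print(f"possible drift: PSI={psi:.2f}")
```

A PSI of zero means the two binned distributions match, and larger values mean the production data has moved further from the training data.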


Explainability

Additionally, there are times when it would be helpful to understand a model’s predictions without the help of a data scientist. Modzy’s explainability features allow you to intuitively understand why a model is making a certain prediction. For example, for image classification models, Modzy highlights the region of the input image determined to be most important, and for text classification models, Modzy color-codes the words in the input text according to their determined importance. This level of transparency means that any stakeholder, regardless of technical acumen, can understand the “why” behind predictions, and it can also help you determine whether models are making predictions for the right reasons.
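
To give a feel for how per-word importance can be computed for a text model, here is a minimal sketch of leave-one-out (occlusion) scoring: remove each word in turn and measure how much the model’s score changes. This is a generic technique shown for illustration, not a description of how Modzy’s explainability works. The `word_importance` function and the `toy_sentiment` scorer are hypothetical stand-ins for a real model.

```python
def word_importance(text, score):
    """Return (word, importance) pairs, one per word in `text`.

    Importance is the drop in `score` when that word is removed, so a
    positive value means the word pushed the score up.
    """
    words = text.split()
    baseline = score(" ".join(words))
    importances = []
    for i, word in enumerate(words):
        reduced = " ".join(words[:i] + words[i + 1:])
        importances.append((word, baseline - score(reduced)))
    return importances


# Hypothetical scorer standing in for a real text classifier.
TOY_WEIGHTS = {"great": 1.0, "slow": -1.0}


def toy_sentiment(text):
    return sum(TOY_WEIGHTS.get(w.lower(), 0.0) for w in text.split())


for word, importance in word_importance("the service was great", toy_sentiment):
    print(f"{word:>8}  {importance:+.1f}")
```

The resulting scores are what a user interface could map to colors, with the most influential words shaded most strongly.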

API Security

Finally, as mentioned in the previous section, Modzy automatically logs all inference jobs submitted to the model for auditing purposes, and these logs are easily accessible in the Job History page or via the appropriate API calls. This gives you an audit trail of all activity within Modzy and a complete view of your organization’s AI health. In addition to letting you monitor and control costs, Modzy lets you specify role-based access controls so that information remains restricted to the appropriate parties, which supports both API security and governance.

Modzy was designed with the entire ModelOps pipeline in mind. Therefore, it provides a wealth of functionality to support you every step of the way and help you maximize the return on your AI investment. For more information on building and maintaining your ModelOps pipeline, check out all of the previous blog posts in this series: