Modzy is an enterprise software platform equipped to manage and host machine learning models built in any programming language or framework – at scale. At the heart of this capability is Modzy’s abstraction for a machine learning model, which enables the platform to use containers and container orchestration tools to manage machine learning models. As such, the internal details of any model can be opaque to the platform, so long as the model is packaged within a container that implements a common interface specification.

Origins of a model interface

In its 1.0 release, Modzy’s model interface was built as a specification for a REST application programming interface (API) that could be instantiated as a web server run from within a container. Not only does this design foster standardization, but it also makes models easy to package and port. The endpoints exposed by the server capture near-universal tasks required of machine learning models that are intended to run in a production setting. These include the ability to instantiate and load a model, as well as to run inputs against it in order to obtain inference results.
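To make the load-then-run idea concrete, here is a minimal sketch of such an interface written as a plain WSGI application using only the Python standard library. It is an illustration, not the Modzy template itself: the route names `/status` and `/run`, the `EchoModel` class, the `features` payload field, and the `call` helper are all assumptions made up for this example.

```python
import io
import json
from wsgiref.util import setup_testing_defaults


class EchoModel:
    """Hypothetical stand-in for a real model; its internals are opaque to the caller."""

    def __init__(self):
        self.loaded = False

    def load(self):
        # A real model would read weights from disk here.
        self.loaded = True

    def predict(self, features):
        return {"sum": sum(features)}


MODEL = EchoModel()


def app(environ, start_response):
    """WSGI app exposing two assumed routes: GET /status and POST /run."""
    method = environ["REQUEST_METHOD"]
    path = environ["PATH_INFO"]
    if method == "GET" and path == "/status":
        # Assumption for this sketch: the status check also loads the model.
        MODEL.load()
        status, body = "200 OK", {"status": "ready"}
    elif method == "POST" and path == "/run":
        if not MODEL.loaded:
            status, body = "409 Conflict", {"error": "model not loaded"}
        else:
            length = int(environ.get("CONTENT_LENGTH") or 0)
            payload = json.loads(environ["wsgi.input"].read(length))
            status, body = "200 OK", MODEL.predict(payload["features"])
    else:
        status, body = "404 Not Found", {"error": "unknown route"}
    data = json.dumps(body).encode("utf-8")
    start_response(status, [("Content-Type", "application/json"),
                            ("Content-Length", str(len(data)))])
    return [data]


def call(method, path, payload=None):
    """Invoke the WSGI app in-process, without opening a socket."""
    raw = json.dumps(payload).encode("utf-8") if payload is not None else b""
    environ = {}
    setup_testing_defaults(environ)
    environ.update({
        "REQUEST_METHOD": method,
        "PATH_INFO": path,
        "CONTENT_LENGTH": str(len(raw)),
        "wsgi.input": io.BytesIO(raw),
    })
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], json.loads(body)


print(call("GET", "/status"))
print(call("POST", "/run", {"features": [1, 2, 3]}))
```

Because the caller only ever sees the two routes, the model behind them could be swapped for one written in any framework without the caller noticing, which is the point of the abstraction.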

This abstraction proved to be powerful, and through an open-source instantiation of Modzy’s 1.0 template in Python using the Flask framework, the Modzy team was able to make it highly accessible. This empowered Modzy data scientists, partners, and customers to launch and run models on the Modzy platform with ease.

The next iteration

Several converging demand signals from our clients served as a catalyst for revisiting the template and building a 2.0 container template. On one hand, we frequently encountered workloads that required running models at the edge to handle large quantities of data, often ingested as a stream. On the other hand, clients often asked for drift detection and explainability on high-throughput and batch workloads, which we could address through the research coming out of the Modzy Labs team.

Accommodating these features, doubling down on model security, and ensuring that users of the 1.0 template would have a guided, seamless experience transitioning to the new template were the principal considerations when we designed the 2.0 container specification.

The rise of gRPC

The Modzy team opted to swap out the REST API, which used JSON over HTTP/1.1, for a gRPC service using protocol buffers over HTTP/2. The benefits of adopting gRPC for this container were manifold. This approach opened the door for bidirectional streaming between client and server, allowing for more complex interaction patterns and improvements in throughput. The gRPC client code also proved to be less complex than its REST counterpart. A version-controlled protocol buffer specification simplified the task of maintaining and versioning the API, as well as auto-generating clients in each of the popular programming languages.
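The bidirectional streaming pattern mentioned above can be sketched without any gRPC dependency. In gRPC's Python API, a stream-to-stream handler receives an iterator of requests and yields responses; the example below mimics that shape with plain generators and dataclasses. The message types `RunRequest` and `RunResponse` and the handler `run_stream` are hypothetical names invented for this illustration, not part of Modzy's protocol buffer specification.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass
class RunRequest:
    """Hypothetical request message: one frame of input data."""
    frame_id: int
    values: List[float]


@dataclass
class RunResponse:
    """Hypothetical response message: one inference result."""
    frame_id: int
    mean: float


def run_stream(request_iterator: Iterable[RunRequest], context=None) -> Iterator[RunResponse]:
    """Mimics the shape of a gRPC stream-to-stream handler.

    Each response is yielded as soon as its request has been consumed,
    so the caller never has to wait for the whole input to arrive.
    """
    for request in request_iterator:
        # Toy "model": the mean of the frame's values.
        mean = sum(request.values) / len(request.values) if request.values else 0.0
        yield RunResponse(frame_id=request.frame_id, mean=mean)


def sensor_frames() -> Iterator[RunRequest]:
    """Stand-in for a client producing inputs one at a time."""
    yield RunRequest(1, [1.0, 3.0])
    yield RunRequest(2, [10.0])


responses = list(run_stream(sensor_frames()))
for response in responses:
    print(response)
```

In a real service the iterator would be fed by the network rather than by a local generator, but the handler body would look much the same, which is part of why streaming workloads map naturally onto this style of API.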

Container 2.0 feature set

The development of the 2.0 template coincided with a push to support UDP input streams on the Modzy platform to enable edge workloads and real-time inference with drift detection and explainability. By pairing the powerful features of gRPC with schemas designed for the most common types of models that expose drift and explainability, the Modzy team opened the door for platform-supported explainability and drift detection. At the same time, this approach ensured extensibility for users looking to employ their own drift and explainability solutions. Lastly, a suite of DISA-compliant containers was rolled out to lay a foundation of enhanced security for all models produced by the Modzy team.

With these ingredients, the new 2.0 container specification and accompanying template are easier to develop with and use, and are faster and more secure than ever. The 2.0 Modzy model template now has an open-source reference implementation in Python, complete with documentation designed to guide new users through the process of getting their model production-ready. For a deeper dive on the tech, check it out on GitHub at https://github.com/modzy/grpc-model-template