Production is often viewed as the final frontier in the machine learning process. By now, your data scientists have trained a model on your data, your machine learning and software engineers have incorporated that model into an application, your DevOps team has configured the automation that containerizes the application for use by the rest of the organization, and your IT department has set up the infrastructure to host the model's application. At this point, most program managers flip the proverbial switch, allow users to rely on the solution, and move on to the next thing.
That's exactly the wrong thing to do.
This post in the ModelOps blog series covers the production step of the ModelOps pipeline (AI in production) and the active management required to successfully field a machine learning model. See the previous post in the series for more on AI model deployment, the step that precedes production.
Moving an AI model into production presents three main hurdles:
- Making the model available to users
- Monitoring the model’s performance
- Providing feedback to your data scientists so that the model can be improved
Model Availability
While making a model available seems straightforward, there are a number of barriers right out of the gate. At this point in the process, the temptation is to simply boot up the Dockerized container and run it on the IT-provided infrastructure. The problem with that approach is that it lacks any sort of governance and raises a number of questions:
- Who should be able to access the model? Can IT restrict access to the appropriate people?
- Where is the data being stored? Is the data storage temporary or indefinite?
- Is the data or the model restricted by corporate policy or law because it involves HR data or personally identifiable information (PII)?
Many homegrown production systems fail to consider these factors and, as a result, can open your organization up to a number of risks. To successfully mitigate these challenges when moving AI models into production, the Data Science, IT, DevOps, and Policy groups must be engaged and working together from the start. This includes developing a multi-year strategy, agreed to by all stakeholders, for how models will be used and maintained. Modzy was designed to resolve these issues through built-in governance functionality that makes models discoverable and restrictable, while also letting organizations keep the data being processed as ephemeral as they need.
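To make the stakes concrete, the sketch below shows the kind of access and retention checks a homegrown serving layer would have to build and maintain itself. The role names, retention window, and purge logic are illustrative assumptions for this sketch, not a description of Modzy's API.

```python
# Minimal sketch of the governance checks a homegrown serving layer would need.
# Role names, retention window, and record layout are illustrative assumptions.
from datetime import datetime, timedelta, timezone

ALLOWED_ROLES = {"fraud-model-v2": {"analyst", "risk-team"}}  # who may call which model
RETENTION = timedelta(hours=24)                               # how long inputs may persist


def authorize(user_roles: set[str], model_id: str) -> bool:
    """Only users holding an approved role may invoke the model."""
    return bool(user_roles & ALLOWED_ROLES.get(model_id, set()))


def purge_expired(records: list[dict]) -> list[dict]:
    """Drop stored inference inputs older than the retention window,
    keeping processed data as ephemeral as policy requires."""
    cutoff = datetime.now(timezone.utc) - RETENTION
    return [r for r in records if r["received_at"] >= cutoff]
```

Even this toy version raises the questions above: someone has to decide the role list, the retention window, and what the policy group considers acceptable, which is why those stakeholders need to be involved before the container ever boots.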
Model Monitoring
Monitoring a model's performance is an oft-overlooked requirement during development, but it is vital for successful deployment and for maintaining a model's performance over time. To manage your model effectively, you need to measure how well its performance meets users' expectations. This means being able to identify false alarms, flag data drift, log anomalous run times, and patch bugs for non-model-related issues. In homegrown systems, these monitoring capabilities add a layer of development that can drastically increase investment costs, but without them, you run the risk of fielding a model that users never adopt because of issues that have nothing to do with the model's efficacy:
- Software bugs you didn't find in QA (if you ran QA at all)
- Users making decisions based on bad inference results because they unknowingly ran data through your model that differs too much from its training data
- Infrastructure issues that delay or prevent your model from serving users
All these challenges can unfairly give your solution and your investment a bad reputation if you fail to monitor performance. Modzy addresses these issues with targeted monitoring capabilities: explainability and data drift detection to track model performance, processing-time metrics to flag inference issues, and feedback mechanisms that let users mark good and bad model outputs.
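As a rough illustration of what homegrown monitoring involves, the sketch below flags data drift with a two-sample Kolmogorov-Smirnov test and flags anomalous run times. The thresholds, feature choice, and print-based alerting are illustrative assumptions, not a description of Modzy's internal checks.

```python
# A minimal drift-and-latency monitoring sketch for a homegrown setup.
import time

import numpy as np
from scipy.stats import ks_2samp

DRIFT_P_VALUE = 0.01    # flag drift when the KS test rejects at this level (assumed threshold)
LATENCY_BUDGET_S = 0.5  # flag inferences slower than this budget (assumed threshold)


def check_drift(training_feature: np.ndarray, live_feature: np.ndarray) -> bool:
    """Compare a live feature's distribution against the training distribution."""
    statistic, p_value = ks_2samp(training_feature, live_feature)
    return p_value < DRIFT_P_VALUE  # True means the distributions likely diverged


def timed_inference(predict, inputs):
    """Wrap a model call and flag anomalous run times."""
    start = time.perf_counter()
    outputs = predict(inputs)
    elapsed = time.perf_counter() - start
    if elapsed > LATENCY_BUDGET_S:
        print(f"latency alert: inference took {elapsed:.2f}s")
    return outputs
```

Each of these checks maps to one of the risks above: drift detection catches mismatched input data before users act on bad results, and run-time tracking surfaces infrastructure problems that would otherwise look like model failures.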
Creating a Feedback Loop for Data Scientists
Assuming you've addressed the availability and monitoring issues above, the next challenge is making sure there is a feedback loop to the team that originally developed your models. By creating that feedback loop, you've actually built out a ModelOps pipeline. Many organizations ignore this part of the process today, essentially deploying a model and failing to account for the fact that it will need to be updated and maintained over time.
No piece of software runs forever without updates, so as a decision-maker, it is imperative that you incorporate lifecycle management into your production vision. Modzy has numerous features that support this feedback loop, such as recording model outputs, drift assessments, user accuracy feedback, and more. The details of inference successes, failures, and related metrics are downloadable and can be used to update and improve your models.
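For teams wiring this up themselves, the sketch below shows one way to capture the kind of feedback record a data science team could pull for retraining. The field names, the JSONL log, and the example values are illustrative assumptions rather than Modzy's export format.

```python
# A minimal sketch of a feedback record a data science team could consume for retraining.
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class FeedbackRecord:
    model_id: str
    model_version: str
    inference_id: str
    prediction: str
    drift_flagged: bool
    user_label: Optional[str] = None  # filled in when a user confirms or corrects the output


def log_feedback(record: FeedbackRecord, path: str = "feedback.jsonl") -> None:
    """Append one record to a JSONL log the data science team can pull for retraining."""
    entry = asdict(record)
    entry["logged_at"] = datetime.now(timezone.utc).isoformat()
    with open(path, "a") as f:
        f.write(json.dumps(entry) + "\n")


# Example: a user corrected a prediction the model got wrong.
log_feedback(FeedbackRecord("fraud-model-v2", "2.1.0", "abc123",
                            prediction="fraud", drift_flagged=False,
                            user_label="not_fraud"))
```

The point is less the format than the habit: every inference, drift flag, and user correction that reaches the original modeling team is raw material for the next version of the model.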
As we’ve discussed throughout this blog series, getting reliable machine learning models into production is hard. AI in production requires a diverse set of skills just to make a model available to a single user, but a packaged model running inferences won’t withstand the test of time. For success in the long run, enterprises must implement a three-pronged approach for production defined by governance, monitoring, and a continuous feedback loop. Fortunately, tools like Modzy were designed to help your organization overcome these obstacles, provide a single location where all stakeholders involved can collaborate to manage your AI projects, and serve as the center of your ModelOps pipeline.
To learn more about ModelOps, visit Modzy.com.