Deploy Scalable Machine Learning Models for Long-Term Sustainability – The New Stack


As machine learning (ML) models proliferate in businesses, deploying them in cloud or edge computing environments is proving to be a tall order amid an evolving plethora of tools and frameworks. According to Algorithmia’s 2020 State of Enterprise ML report, “on average, 40% of companies said it takes more than a month to deploy an ML model into production.”

In this episode of The New Stack Makers podcast, recorded at AWS re:Invent, Luis Ceze, co-founder and CEO of OctoML, explains how to optimize and automate the deployment of machine learning models to any hardware, in the cloud or on edge devices.

Alex Williams, Founder and Publisher of The New Stack, hosted this podcast.


Also available on Apple Podcasts, Google Podcasts, Overcast, PlayerFM, Pocket Casts, Spotify, Stitcher, TuneIn

From automating machine learning models to designing, managing and optimizing them, the ability to deliver these models to production quickly while reducing costs and keeping them up to date is a developer productivity challenge. “Machine learning models are very computationally intensive, and once you’re ready to deploy a model, you need to understand the hardware and really optimize it to prepare it for deployment,” Ceze said.

OctoML builds on Apache Tensor Virtual Machine (TVM), an open source machine learning compiler framework created by Ceze and his co-founders, to let developers automatically optimize machine learning models and deploy them to run at scale on any hardware. “Apache TVM creates a set of common primitives on all kinds of different hardware… Then it uses machine learning internally to produce efficient machine learning code. The reason this is important is that the work done to prepare a model for deployment involves a lot of manual software engineering,” Ceze said.

By using machine learning, Apache TVM helps companies streamline the work needed to deploy machine learning models, Ceze said. “Once you get a machine learning model, there are billions of ways you can actually produce the code to represent your model and run it on target hardware,” Ceze said. “How do you pick the fastest when you don’t have time to run them all? And there are billions of them, it’s not practical.”
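The idea of picking a fast variant without running them all can be illustrated with a small sketch. This is not Apache TVM's API or its actual algorithm. It is a toy, standard-library-only example of cost-model-guided search. The search space (`TILE_SIZES`, `UNROLL`), the synthetic `measure` function standing in for a real hardware benchmark, and the k-nearest-neighbour predictor are all assumptions made up for illustration.

```python
import itertools
import math
import random

# Hypothetical search space: two tile sizes and an unroll factor for one loop nest.
TILE_SIZES = [1, 2, 4, 8, 16, 32, 64, 128]
UNROLL = [1, 2, 4, 8]
SPACE = list(itertools.product(TILE_SIZES, TILE_SIZES, UNROLL))  # 256 candidates


def measure(config):
    """Stand-in for compiling and timing one candidate on real hardware.

    The formula is synthetic and invented for this sketch.
    """
    tile_i, tile_j, unroll = config
    cache_penalty = abs(math.log2(tile_i * tile_j) - 10)
    shape_penalty = 0.3 * abs(math.log2(tile_i) - math.log2(tile_j))
    unroll_penalty = 0.5 * abs(math.log2(unroll) - 2)
    return 1.0 + cache_penalty + shape_penalty + unroll_penalty


def features(config):
    """Represent a candidate by the log2 of each parameter."""
    return [math.log2(v) for v in config]


def predict(config, measured, k=3):
    """Cost model: mean latency of the k nearest already-measured candidates."""
    f = features(config)
    nearest = sorted(
        measured.items(), key=lambda kv: math.dist(f, features(kv[0]))
    )[:k]
    return sum(latency for _, latency in nearest) / len(nearest)


def tune(budget=40, batch=8, seed=0):
    """Measure only `budget` candidates, letting the cost model choose which."""
    rng = random.Random(seed)
    # Start from a small random sample so the model has data to learn from.
    measured = {c: measure(c) for c in rng.sample(SPACE, batch)}
    while len(measured) < budget:
        candidates = [c for c in SPACE if c not in measured]
        # Rank unmeasured candidates by predicted latency, cheapest first.
        candidates.sort(key=lambda c: predict(c, measured))
        for c in candidates[: min(batch, budget - len(measured))]:
            measured[c] = measure(c)
    best = min(measured, key=measured.get)
    return best, measured


if __name__ == "__main__":
    best, measured = tune()
    print(f"measured {len(measured)} of {len(SPACE)} candidates")
    print(f"best config found: {best}, latency {measured[best]:.2f}")
```

The sketch only ever exploits the model's predictions. A production tuner would typically also explore candidates the model is unsure about, and would face a search space far larger than 256 entries.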

With a complex toolchain and a fragmented ecosystem, putting machine learning models into production can seem overwhelming because models are updated frequently: changes in data or better ideas can make a model more accurate and efficient, which creates a need to retrain it, Ceze said. “If you don’t have automation, every time there is a change in your model, you have to do some manual work to get it ready for the delivery cycle.”
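One way to picture the automation Ceze describes is a pipeline that re-runs the optimization step only when the model artifact actually changes. The following is a minimal sketch under stated assumptions, not OctoML's implementation: the `DeploymentPipeline` class, its `submit` method, and the `optimize` callback are hypothetical names, and a SHA-256 content hash stands in for whatever change detection a real system would use.

```python
import hashlib


def fingerprint(model_bytes: bytes) -> str:
    """Content hash used to detect that a retrained model differs from the last one."""
    return hashlib.sha256(model_bytes).hexdigest()


class DeploymentPipeline:
    """Hypothetical pipeline: re-optimizes a model only when its bytes change."""

    def __init__(self, optimize):
        self._optimize = optimize      # callback standing in for the optimization step
        self._last_fingerprint = None
        self.artifact = None           # most recent optimized artifact
        self.runs = 0                  # how many times optimization actually ran

    def submit(self, model_bytes: bytes) -> bool:
        """Return True if the model changed and was re-optimized, else False."""
        fp = fingerprint(model_bytes)
        if fp == self._last_fingerprint:
            return False               # unchanged model: nothing to redo
        self.artifact = self._optimize(model_bytes)
        self._last_fingerprint = fp
        self.runs += 1
        return True


if __name__ == "__main__":
    # The lambda is a placeholder for a real optimize-and-package step.
    pipeline = DeploymentPipeline(lambda b: b"optimized:" + b)
    for version in (b"model-v1", b"model-v1", b"model-v2"):
        print(version, "re-optimized" if pipeline.submit(version) else "skipped")
```

Each retrained model then flows through the same step without manual intervention, and unchanged models cost nothing.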

