How PyTorch And AWS Come To The Rescue Of ML Models In Production

“Today, more than 83% of the cloud-based PyTorch projects happen on AWS.” 

The Computer Vision Developer Conference (CVDC) 2020 is a two-day event (13-14 August) organised by the Association of Data Scientists (ADaSci). ADaSci is a premier global professional body of data science and machine learning professionals. Apart from tech talks covering a wide range of topics, CVDC 2020 also features paper presentations, exhibitions and hackathons. There is also a full-day workshop on computer vision that comes with a participation certificate for attendees.

CVDC 2020 kicked off with Suman Debnath’s talk on how to ‘Deploy PyTorch models in Production on AWS with TorchServe’. Suman is a Principal Developer Advocate at AWS. Prior to joining AWS, he worked at organisations such as IBM Software Lab, EMC, NetApp and Toshiba.

Why TorchServe

Though PyTorch has seen a sudden rise in popularity amongst ML practitioners, it does come with a few challenges:

  • there is no official model server,
  • developers need to write custom code, and
  • teams have to build their own systems for scaling, security, etc.

“TorchServe addresses the difficulty of deploying PyTorch models.”

Today, more than 83% of cloud-based PyTorch projects run on AWS, so addressing these challenges is crucial. This is where TorchServe comes in handy. TorchServe is a PyTorch model-serving library that makes it easy to deploy trained models at scale without writing custom code. It was developed by AWS in partnership with Facebook.

Model serving is the process of situating a trained ML model within a system so that it can take new inputs and return inferences to the system. TorchServe allows users to expose a web API for their model that can be accessed directly or via an application.

Excerpts From The Talk

In this talk, Suman detailed how to deploy and manage machine learning models in production, which is often considered the most challenging part of an ML pipeline. Suman, who has extensive experience working with AWS cloud services, introduced the attendees to the advantages of using AWS in conjunction with PyTorch. With TorchServe, one can deploy PyTorch models in either eager mode or graph mode (using TorchScript), serve multiple models simultaneously, version production models for A/B testing, load and unload models dynamically, and more.
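Loading and unloading models dynamically goes through TorchServe's management API. The following is a minimal sketch rather than code from the talk: it only builds the HTTP requests, and it assumes a server listening on the default management address `http://127.0.0.1:8081`. The helper names and the archive name `densenet161.mar` are hypothetical.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

# Assumption: TorchServe's default management address on the local machine.
MANAGEMENT_URL = "http://127.0.0.1:8081"


def register_model(mar_file: str, initial_workers: int = 1) -> Request:
    """Build the request that loads a model archive into a running server."""
    query = urlencode({"url": mar_file, "initial_workers": initial_workers})
    return Request(f"{MANAGEMENT_URL}/models?{query}", method="POST")


def unregister_model(name: str, version: str) -> Request:
    """Build the request that unloads one specific version of a model."""
    return Request(
        f"{MANAGEMENT_URL}/models/{quote(name)}/{quote(version)}",
        method="DELETE",
    )
```

Passing either request to `urllib.request.urlopen` would send it to the server. Because the version is part of the unregister path, two versions of the same model can be registered side by side, which is what makes A/B testing possible.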

Using an EC2 instance as a VM, Suman demonstrated how to launch TorchServe. The following steps give an idea of how TorchServe works:

Install torchserve and torch-model-archiver

pip install torchserve torch-model-archiver

To serve a model with TorchServe, first archive the model as a MAR file. 

Download a trained model.
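The exact commands for the archiving step are not reproduced in this article. As a rough sketch, the archive and launch commands can be assembled as below; the flags are those of the `torch-model-archiver` and `torchserve` command-line tools, while the function names, file names and the `model_store` directory are hypothetical placeholders, not the values used in the demo.

```python
from typing import List, Optional


def archive_command(
    model_name: str,
    version: str,
    serialized_file: str,
    handler: str,
    export_path: str = "model_store",  # hypothetical output directory
    model_file: Optional[str] = None,  # needed for eager-mode models only
) -> List[str]:
    """Build the torch-model-archiver call that packages a model as a MAR file."""
    cmd = [
        "torch-model-archiver",
        "--model-name", model_name,
        "--version", version,
        "--serialized-file", serialized_file,
        "--handler", handler,
        "--export-path", export_path,
    ]
    if model_file is not None:
        cmd += ["--model-file", model_file]
    return cmd


def serve_command(model_store: str, mar_file: str) -> List[str]:
    """Build the torchserve call that starts the server with one archived model."""
    return ["torchserve", "--start", "--model-store", model_store, "--models", mar_file]
```

Each list can be passed to `subprocess.run(cmd, check=True)` once both tools are installed, for example `archive_command("densenet161", "1.0", "densenet161.pth", "image_classifier")`, where the weights file name is a placeholder for the downloaded model.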


To get predictions from a model, test the model server by sending a request to the server’s predictions API.
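A prediction request might look like the sketch below. It assumes a server on TorchServe's default inference address `http://127.0.0.1:8080` and a registered model. The function names, the model name and the image path are hypothetical, and sending the raw bytes as `application/octet-stream` is an assumption about how the handler expects its input.

```python
from urllib.request import Request, urlopen

# Assumption: TorchServe's default inference address on the local machine.
INFERENCE_URL = "http://127.0.0.1:8080"


def build_prediction_request(model_name: str, payload: bytes) -> Request:
    """Build a POST to the predictions API with the raw input as the body."""
    return Request(
        f"{INFERENCE_URL}/predictions/{model_name}",
        data=payload,
        headers={"Content-Type": "application/octet-stream"},
        method="POST",
    )


def predict(model_name: str, input_path: str) -> str:
    """Send a file (for example an image) to the server and return the response text."""
    with open(input_path, "rb") as f:
        payload = f.read()
    with urlopen(build_prediction_request(model_name, payload)) as response:
        return response.read().decode("utf-8")
```

With a server running, `predict("densenet161", "kitten.jpg")` would return whatever the model's handler produces, typically a JSON string of predictions.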

Talking about the real-world applications of TorchServe, Suman cited the examples of Toyota and Matroid. Toyota Research Institute Advanced Development, Inc. (TRI-AD) trains its computer vision models with PyTorch, but the framework lacked a model-serving component. As a result, the car maker spent significant engineering effort creating and maintaining software for deploying PyTorch models to its fleet of vehicles and cloud servers.

With TorchServe, Toyota now has a performant and lightweight model server that is officially supported and maintained by AWS and the PyTorch community. In the case of Matroid, a maker of computer vision software, TorchServe simplifies model deployment through a single servable file that also serves as the single source of truth and is easy to share and manage.

Stay tuned for more updates from CVDC 2020.

Ram Sagar
I have a master's degree in Robotics and I write about machine learning advancements.