
ML Ecosystem Gets Mature With The Release Of PyTorch Hub


The rate at which machine learning research gets published has increased over the past couple of years. Significant models such as BERT for NLP tasks remain difficult to reproduce.


Many of these publications are accompanied by code and trained models, which is helpful but still leaves a number of steps for users to figure out for themselves.

Resources like Papers With Code help developers implement these breakthroughs, but at the end of the day, progress is measured by ease of implementation, which requires reproducibility.

To address these issues, the Facebook AI team behind PyTorch released PyTorch Hub.

A Hub For Easy Deployment

PyTorch has been the go-to platform for building deep learning models. PyTorch enables fast, flexible experimentation and efficient production through a hybrid front-end, distributed training, and ecosystem of tools and libraries.

PyTorch Hub is a simple API and workflow that provides the basic building blocks for improving machine learning research reproducibility. It consists of a pre-trained model repository designed specifically to facilitate research reproducibility and enable new research.

It also has built-in support for Colab, integration with Papers With Code and currently contains a broad set of models that include Classification and Segmentation, Generative, Transformers, etc.

PyTorch Hub supports the publication of pre-trained models (model definitions and pre-trained weights) to a GitHub repository by adding a simple hubconf.py file. This file enumerates which models are supported and lists the dependencies needed to run them.

Why model implementation is easy:

  • Each model file can function and be executed independently.
  • They don't require any package other than PyTorch (encoded in hubconf.py as dependencies = ['torch']).
  • They don't need separate entry points, because the models work seamlessly out of the box once created.
  • Minimizing package dependencies reduces the friction for users who want to load a model for immediate experimentation.
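
To make the points above concrete, here is a minimal sketch of what a hubconf.py might look like. The entry point name tiny_classifier, its parameters, and the layer sizes are hypothetical and chosen only for illustration; they do not come from any published repository.

```python
# hubconf.py (hypothetical sketch; names and layer sizes are illustrative)

# Packages that must be installed for the entry points below to work.
dependencies = ['torch']


def tiny_classifier(pretrained=False, num_classes=10):
    """Hypothetical entry point that returns a small torch.nn.Module."""
    # Imported inside the function so this file can be read without torch.
    import torch.nn as nn

    model = nn.Sequential(
        nn.Flatten(),
        nn.Linear(28 * 28, 64),
        nn.ReLU(),
        nn.Linear(64, num_classes),
    )
    if pretrained:
        # A real hubconf.py would fetch and load a checkpoint here.
        # No checkpoint exists for this sketch, so it fails loudly instead.
        raise NotImplementedError("no checkpoint is published for this sketch")
    return model
```

Every public function in the file acts as an entry point, which is why no separate registration step is needed.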

Explore And Load Easily

Bidirectional Encoder Representations from Transformers, or BERT, which was open-sourced late last year, opened new ground for tackling the intricacies of language understanding. BERT uses WordPiece embeddings with a 30,000-token vocabulary and learned positional embeddings, with supported sequence lengths of up to 512 tokens.

BERT helped explore the unsupervised pre-training of natural language understanding systems.

PyTorch Hub enables calling BERT with just a few lines of code.

Here is a code snippet that specifies an entry point for the bertForMaskedLM model, which returns the model with its pre-trained weights loaded.

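# Import path assumed from the pytorch-pretrained-BERT package of the time.
from pytorch_pretrained_bert.modeling import BertForMaskedLM

def bertForMaskedLM(*args, **kwargs):
    # Forward all arguments to from_pretrained, which loads the weights.
    model = BertForMaskedLM.from_pretrained(*args, **kwargs)
    return model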

PyTorch Hub also allows auxiliary entry points (other than pretrained models), e.g. bertTokenizer for preprocessing in the BERT models, to make the user workflow smoother.

Users can explore all available entry points in a repo using the torch.hub.list() API.

>>> torch.hub.list('pytorch/vision')
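
The call above needs network access, so here is an offline sketch of the idea behind entry point discovery: the public callables in a repository's hubconf.py are treated as entry points. The helper list_entrypoints and the stand-in module are hypothetical and only mimic that rule; this is not the torch.hub implementation.

```python
import types


def list_entrypoints(module):
    """Return the names of public callables in a hubconf-like module."""
    return sorted(
        name for name in dir(module)
        if callable(getattr(module, name)) and not name.startswith('_')
    )


# Stand-in for a repository's hubconf.py (hypothetical contents).
fake_hubconf = types.ModuleType('hubconf')
fake_hubconf.dependencies = ['torch']  # a list is not callable, so it is skipped
fake_hubconf.resnet18 = lambda pretrained=False: 'resnet18 model'
fake_hubconf.deeplabv3_resnet101 = lambda pretrained=False: 'deeplab model'
fake_hubconf._download_weights = lambda url: None  # private helper, hidden

entrypoints = list_entrypoints(fake_hubconf)
print(entrypoints)  # ['deeplabv3_resnet101', 'resnet18']
```

Helpers whose names start with an underscore stay hidden, which lets repo owners keep internal utilities out of the listing.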

Users can load a model entry point using the torch.hub.load() API. In addition, the torch.hub.help() API can provide useful information about how to instantiate the model.

print(torch.hub.help('pytorch/vision', 'deeplabv3_resnet101'))

model = torch.hub.load('pytorch/vision', 'deeplabv3_resnet101', pretrained=True)
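
Conceptually, help shows the docstring of an entry point, and load calls that entry point while forwarding any extra arguments such as pretrained=True. The offline sketch below mimics this with a stand-in module; sketch_help, sketch_load, and tiny_model are hypothetical names, and the "model" is a plain dictionary so the example runs without PyTorch or a network connection.

```python
import types

# Stand-in for a repository's hubconf.py (hypothetical contents).
hubconf = types.ModuleType('hubconf')


def tiny_model(pretrained=False, num_classes=10):
    """Tiny placeholder model.

    pretrained (bool): if True, pretend to load published weights.
    num_classes (int): size of the output layer.
    """
    return {'pretrained': pretrained, 'num_classes': num_classes}


hubconf.tiny_model = tiny_model


def sketch_help(module, name):
    """Mimic the idea of help: return the entry point's docstring."""
    return getattr(module, name).__doc__


def sketch_load(module, name, *args, **kwargs):
    """Mimic the idea of load: call the entry point with the user's arguments."""
    return getattr(module, name)(*args, **kwargs)


print(sketch_help(hubconf, 'tiny_model'))
model = sketch_load(hubconf, 'tiny_model', pretrained=True)
print(model)  # {'pretrained': True, 'num_classes': 10}
```

This is why a well-written docstring on each entry point matters: it is the main way users learn which arguments a model accepts.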

PyTorch Hub makes it super simple for users to get the latest update by calling:

model = torch.hub.load(..., force_reload=True)

This will help to alleviate the burden of repetitive package releases by repo owners and instead allow them to focus more on their research. It also ensures that, as a user, you are getting the freshest available models.



Ram Sagar
I have a master's degree in Robotics and I write about machine learning advancements.
