TorchVision has a new backwards-compatible API for building models with multi-weight support. The new API allows loading different pre-trained weights on the same model variant, keeps track of vital meta-data such as the classification labels, and includes the preprocessing transforms necessary for using the models.
Limitations of the current API
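TorchVision currently provides pre-trained models which can be a starting point for transfer learning or be used as-is in computer vision applications. The current API, however, cannot load more than one set of pre-trained weights for the same model variant, and it does not keep the associated meta-data or preprocessing transforms alongside the weights. The new API addresses these limitations and reduces the amount of boilerplate code needed for standard tasks.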
Associated meta-data & preprocessing transforms
The weights of each model are associated with meta-data. The type of information we store depends on the task of the model (classification, detection, segmentation, etc.). Typical information includes a link to the training recipe, the interpolation mode, the categories, and the validation metrics.
Additionally, each weight entry is associated with the necessary preprocessing transforms. All current preprocessing transforms are JIT-scriptable and can be accessed via the transforms attribute. Before they are used with the data, the transforms need to be initialised/constructed. This lazy initialisation scheme keeps the solution memory efficient, because a transform is only built when it is actually needed. The input of the transforms can be either a PIL.Image or a Tensor read using torchvision.io.
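To make the association between weights, meta-data and lazily constructed transforms concrete, here is a minimal standard-library sketch. It is not TorchVision's implementation: `Normalize`, `WeightEntry`, `ToyModelWeights` and the `example.invalid` URL are hypothetical names invented for illustration. The point is only that the enum entry stores a factory for the transform, so nothing is built until the factory is called.

```python
from dataclasses import dataclass
from enum import Enum
from functools import partial
from typing import Any, Callable, Dict, List


class Normalize:
    """Hypothetical stand-in for a preprocessing transform (not a torchvision class)."""

    def __init__(self, mean: float, std: float) -> None:
        self.mean = mean
        self.std = std

    def __call__(self, pixels: List[float]) -> List[float]:
        return [(p - self.mean) / self.std for p in pixels]


@dataclass
class WeightEntry:
    """Hypothetical record tying a checkpoint to its transforms and meta-data."""

    url: str
    transforms: Callable[[], Any]  # a factory, not an already-built transform
    meta: Dict[str, Any]


class ToyModelWeights(Enum):
    V1 = WeightEntry(
        url="https://example.invalid/toy_v1.pth",  # placeholder, never fetched
        transforms=partial(Normalize, mean=0.5, std=0.25),
        meta={"categories": ["cat", "dog"], "interpolation": "bilinear"},
    )


weights = ToyModelWeights.V1
preprocess = weights.value.transforms()  # the transform is constructed only here
batch = preprocess([0.5, 0.75, 1.0])
categories = weights.value.meta["categories"]
print(batch, categories)
```

Storing a `functools.partial` rather than an instance is one simple way to get the lazy behaviour: defining the enum costs almost nothing, and each caller builds its own transform object on demand.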
Associating the weights with their meta-data and preprocessing will boost transparency, improve reproducibility and make it easier to document how a set of weights was produced.
Get weights by name
The ability to directly link the weights with their properties (meta-data, preprocessing callables, etc.) is the reason this implementation uses Enums instead of Strings. Nevertheless, for cases where only the name of the weights is available, the API offers a method that maps weight names to their Enums.
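A name-to-Enum lookup of this kind can be sketched with the standard library alone. The function name `weight_from_name`, the `ToyModelWeights` enum and the `"EnumClass.MEMBER"` string format are assumptions made for this illustration; they are not taken from the TorchVision source.

```python
from enum import Enum


class ToyModelWeights(Enum):
    """Hypothetical weights enum; the values stand in for checkpoint files."""

    IMAGENET1K_V1 = "toy_v1.pth"
    IMAGENET1K_V2 = "toy_v2.pth"


# Registry of the known weight enums, keyed by class name.
_WEIGHT_ENUMS = {cls.__name__: cls for cls in (ToyModelWeights,)}


def weight_from_name(name: str) -> Enum:
    """Resolve a string such as 'ToyModelWeights.IMAGENET1K_V2' to its member."""
    parts = name.split(".")
    if len(parts) != 2:
        raise ValueError(f"Expected 'EnumClass.MEMBER', got {name!r}")
    enum_name, member_name = parts
    try:
        # Enum classes support lookup by member name via indexing.
        return _WEIGHT_ENUMS[enum_name][member_name]
    except KeyError:
        raise ValueError(f"Unknown weights: {name!r}") from None


resolved = weight_from_name("ToyModelWeights.IMAGENET1K_V2")
print(resolved)
```

Because the result is an Enum member rather than a string, everything attached to that member stays reachable from the returned object.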
In the new API, the boolean pretrained and pretrained_backbone parameters, which were previously used to load weights to the full model or to its backbone, are deprecated. However, the current implementation is fully backwards compatible, as it seamlessly maps the old parameters to the new ones.
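One way such a mapping from the old boolean flag to the new weights parameter could work is sketched below. This is an illustration under stated assumptions, not TorchVision's code: `resolve_weights` and `ToyModelWeights` are hypothetical, the sketch assumes that `pretrained=True` maps to one designated legacy entry, and rejecting calls that pass both parameters is a choice made here for simplicity.

```python
import warnings
from enum import Enum
from typing import Optional


class ToyModelWeights(Enum):
    """Hypothetical weights enum with a single legacy entry."""

    IMAGENET1K_V1 = "toy_v1.pth"


def resolve_weights(
    weights: Optional[ToyModelWeights] = None,
    pretrained: Optional[bool] = None,
) -> Optional[ToyModelWeights]:
    """Translate the deprecated boolean flag into a weights entry (or None)."""
    if pretrained is None:
        return weights  # new-style call, nothing to translate
    if weights is not None:
        raise ValueError("Pass either 'weights' or 'pretrained', not both.")
    warnings.warn(
        "'pretrained' is deprecated; pass 'weights' instead.",
        DeprecationWarning,
        stacklevel=2,
    )
    # Assumption of this sketch: True selects the designated legacy weights.
    return ToyModelWeights.IMAGENET1K_V1 if pretrained else None


with warnings.catch_warnings():
    warnings.simplefilter("ignore", DeprecationWarning)
    legacy = resolve_weights(pretrained=True)
modern = resolve_weights(weights=ToyModelWeights.IMAGENET1K_V1)
print(legacy is modern)
```

Old call sites keep working and end up with the same weights entry as new ones, while the warning nudges callers towards the explicit parameter.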