Once deemed a research-only library, PyTorch has gained traction among developers for a wide range of data science workflows. To further improve the user experience, Facebook has released PyTorch 1.5, which includes new APIs and improvements that simplify deep learning workflows. The release focuses largely on widening the library’s reach by allowing C++ users to deploy machine learning models effectively. It also adds a ‘channels last’ memory format for computer vision models and a stable distributed RPC framework for model-parallel training. Furthermore, PyTorch 1.5 no longer supports Python 2; the library is now limited to Python 3, specifically 3.5 and above.
Major Functionality For High-Speed Requirements
Previously tagged as experimental, the C++ frontend API is now stable and ready for production use. The C++ frontend was a highly anticipated feature, as it extends the library to a much wider variety of applications. Data scientists and researchers have mostly favoured the Python interface for its simplicity and flexibility, but that left out the huge C++ codebases behind domains such as 3D graphics and gaming. In photo-editing software and graphics backends, developers need speed, and Python is not the best fit for such tasks.
Now that the C++ API is stable, adoption of the library among existing C++ codebases should grow further, enabling it to run in low-latency systems and highly multithreaded environments.
Enhancements:

- With this stable release, users can easily translate their models from the Python API to the C++ API using the torch::nn module
- Fixed the C++ optimisers, which had deviated from their Python equivalents
- Introduced a tensor multi-dim indexing API in C++, one of the most anticipated fixes
New:
To allow users to bind custom C++ classes into TorchScript and Python, the firm has introduced the torch::class_ API. It is almost identical in syntax to `pybind11`: a custom class inherits from torch::CustomClassHolder, and the class and its methods can then be invoked on arbitrary C++ objects from both TorchScript and Python.
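For illustration, here is a minimal sketch of the Python side, assuming a hypothetical C++ class MyStackClass has already been bound with torch::class_ under a my_classes namespace and compiled into libcustom_class.so (all of these names are illustrative):

```python
import torch

# Hypothetical artifact: libcustom_class.so is assumed to contain a C++
# MyStackClass bound with torch::class_ under the "my_classes" namespace.
torch.classes.load_library("build/libcustom_class.so")

# Instantiate the bound C++ class and call its methods from Python.
s = torch.classes.my_classes.MyStackClass(["foo", "bar"])
s.push("baz")
print(s.pop())  # prints "baz"

# The same bound class is usable from TorchScript.
@torch.jit.script
def pop_top(stack: torch.classes.my_classes.MyStackClass) -> str:
    return stack.pop()
```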
Channels Last For Computer Vision
To streamline computer vision workflows, PyTorch now comes with a ‘channels last’ (NHWC) memory layout that lets users exploit performance-efficient convolution algorithms and hardware such as NVIDIA’s Tensor Cores, FBGEMM and QNNPACK. The layout is also built to propagate through operators, switching between memory layouts automatically.
Although this feature is released under the experimental tag, writing memory-format-aware operators helps generate outputs in the same memory format as the inputs and dispatch to the most efficient kernels for each memory format.
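A minimal sketch of the layout in use, converting an input tensor and a convolution module to channels last (shapes and sizes here are illustrative):

```python
import torch
import torch.nn as nn

# Move the input and the module's 4D weights to the channels-last
# (NHWC) memory layout.
x = torch.randn(8, 3, 224, 224).to(memory_format=torch.channels_last)
conv = nn.Conv2d(3, 16, kernel_size=3).to(memory_format=torch.channels_last)

# The layout propagates through the operator: the output comes back
# in channels-last format as well.
out = conv(x)
print(out.is_contiguous(memory_format=torch.channels_last))  # True
```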
Distributed RPC Framework APIs
Distributed RPC allows developers to run functions remotely and to reference remote objects without copying the actual data around, among other capabilities. After bug fixes and functionality enhancements, the feature has now been moved to stable to simplify data scientists’ workflows. The main APIs within this framework, tied together in the sketch after this list, include:
- RPC API: The three main calls — rpc_sync(), rpc_async(), and remote() — let data scientists handle return values effectively for a wide range of tasks, specifying functions to run and objects to be instantiated on remote nodes. Such calls are recorded so that gradients can backpropagate through remote nodes using distributed autograd.
- Distributed Autograd: It stitches the autograd graph across the participating nodes and enables gradients to flow during the backward pass. Unlike the .grad field, gradients are accumulated into a context, so users must run their model’s forward pass inside dist_autograd.context() to ensure all RPC communication is recorded.
- Distributed Optimizer: With parameters and gradients scattered across workers during distributed forward and backward passes, the distributed optimiser manages them by creating an RRef to an optimiser on each worker that holds parameters requiring gradients, and then uses the RPC API to run those optimisers remotely.
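The sketch below ties the three APIs together on two workers spawned on a single machine; the worker names, tensor shapes, port and learning rate are illustrative:

```python
import os
import torch
import torch.distributed.autograd as dist_autograd
import torch.distributed.rpc as rpc
import torch.multiprocessing as mp
import torch.optim as optim
from torch.distributed.optim import DistributedOptimizer


def run_worker(rank, world_size):
    # Single-machine rendezvous; both workers join the same RPC group.
    os.environ["MASTER_ADDR"] = "localhost"
    os.environ["MASTER_PORT"] = "29500"
    rpc.init_rpc(f"worker{rank}", rank=rank, world_size=world_size)

    if rank == 0:
        # rpc_sync blocks until the remote call returns its result.
        res = rpc.rpc_sync("worker1", torch.add,
                           args=(torch.ones(2), torch.ones(2)))

        # rpc_async returns a future; remote() returns an RRef that
        # references the remote value without copying it back.
        fut = rpc.rpc_async("worker1", torch.mul, args=(res, 2.0))
        rref = rpc.remote("worker1", torch.ones, args=(2, 2))
        print(fut.wait(), rref.to_here())

        # Distributed autograd records RPCs made inside this context so
        # the backward pass can flow across workers; gradients accumulate
        # in the context rather than in .grad fields.
        w = torch.rand(2, requires_grad=True)
        with dist_autograd.context() as context_id:
            loss = rpc.rpc_sync("worker1", torch.add,
                                args=(w, torch.ones(2))).sum()
            dist_autograd.backward(context_id, [loss])

            # The distributed optimiser takes RRefs to the parameters and
            # applies the gradients stored under context_id on each owner.
            opt = DistributedOptimizer(optim.SGD, [rpc.RRef(w)], lr=0.05)
            opt.step(context_id)

    rpc.shutdown()  # blocks until all workers are done


if __name__ == "__main__":
    world_size = 2
    mp.spawn(run_worker, args=(world_size,), nprocs=world_size)
```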
Get detailed release notes here.