Last week, Tesla Motors made the news for delivering big with the Smart Summon feature on its cars. The cars can now be made to move around a parking lot with just a click. A myriad of tools and frameworks run in the background to make Tesla’s futuristic features a success. One such framework is PyTorch.
PyTorch has gained popularity over the past couple of years and it is now powering the fully autonomous objectives of Tesla Motors.
During a talk at the recently concluded PyTorch developer conference, Andrej Karpathy, who plays a key role in Tesla’s self-driving capabilities, spoke about how the full AI stack utilises PyTorch in the background.
Tesla Workflow With PyTorch
Tesla Motors is known for pioneering the self-driving vehicle revolution in the world. They are also known for achieving high reliability in autonomous vehicles without the use of either LIDAR or high definition maps. Tesla cars depend entirely upon computer vision.
Tesla is a fairly vertically integrated company, and that is also true when it comes to the intelligence of the autopilot. Everything that goes into making Tesla autopilot the best in the world is based on machine learning and the raw video streams that come from 8 cameras around the vehicle. The footage from these cameras is processed through convolutional neural networks (CNNs) for object detection and, eventually, other actions.
The collected data is labelled, the networks are trained on on-premise GPU clusters, and the result is then taken through the entire stack. The networks run on Tesla’s own custom hardware, giving the company full control over the lifecycle of all these features, which are deployed to almost 4 million Teslas around the world.
For instance, a single frame from the footage of a single camera can contain the following:
- Road markings
- Traffic lights
- Overhead signs
- Moving objects
- Static objects
- Environment tags
So this quickly becomes a multi-task setting. A typical Tesla computer vision workflow connects all these tasks to a shared ResNet-50-like backbone running on images of roughly 1000 x 1000 pixels. The backbone is shared because having a separate neural network for every single task is costly and inefficient. These networks with a shared backbone are called Hydra Nets.
There are multiple Hydra Nets for multiple tasks and the information gathered from all these networks can be used to solve recurring tasks. This requires a combination of data-parallel and model-parallel training.
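The shared-backbone idea can be sketched in a few lines of PyTorch. This is only a minimal illustration, not Tesla's architecture: the tiny convolutional stack stands in for the ResNet-50-like backbone, and the class name, task names and output sizes are all hypothetical.

```python
import torch
from torch import nn


class TinyHydraNet(nn.Module):
    """Shared backbone feeding one lightweight head per task (illustrative only)."""

    def __init__(self, task_outputs):
        super().__init__()
        # Small stand-in for the ResNet-50-like backbone described in the article.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        # One linear head per task; names and sizes are hypothetical.
        self.heads = nn.ModuleDict(
            {name: nn.Linear(32, n_out) for name, n_out in task_outputs.items()}
        )

    def forward(self, images):
        features = self.backbone(images)  # computed once, shared by every head
        return {name: head(features) for name, head in self.heads.items()}


tasks = {"road_markings": 4, "traffic_lights": 3, "moving_objects": 5}
model = TinyHydraNet(tasks)
frames = torch.randn(2, 3, 64, 64)  # two small fake camera frames
outputs = model(frames)
shapes = {name: tuple(out.shape) for name, out in outputs.items()}
```

Because the backbone runs once per frame, adding another task only costs one extra head rather than a whole new network.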
Multi-task training can be done in three main ways, as illustrated above. Though round-robin looks straightforward, it is not efficient: it works through the tasks one after another, whereas a pool of workers can train several tasks simultaneously. The engineers at Tesla find PyTorch to be well suited for carrying out this kind of multitasking.
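The difference between the two approaches can be shown with a toy cost model. The sketch below is an assumption-laden illustration, not Tesla's scheduler: the per-task costs are made-up time units, and the pool uses a simple greedy rule that hands each task to whichever worker frees up first.

```python
import heapq


def round_robin_time(step_costs):
    """One worker visits the tasks in turn, so the costs simply add up."""
    return sum(step_costs.values())


def worker_pool_time(step_costs, n_workers):
    """Greedy schedule: each task goes to the worker that frees up first."""
    finish_times = [0] * n_workers
    heapq.heapify(finish_times)
    # Placing the longest tasks first keeps the workers evenly loaded.
    for cost in sorted(step_costs.values(), reverse=True):
        earliest = heapq.heappop(finish_times)
        heapq.heappush(finish_times, earliest + cost)
    return max(finish_times)


# Hypothetical cost of one training step per task, in arbitrary time units.
step_costs = {
    "road_markings": 3,
    "traffic_lights": 2,
    "moving_objects": 4,
    "static_objects": 3,
}

sequential = round_robin_time(step_costs)           # 12 time units
pooled = worker_pool_time(step_costs, n_workers=2)  # 6 time units
```

With two workers the same four tasks finish in half the time in this toy setting; real gains depend on how evenly the work can be split.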
For autopilot, Tesla trains around 48 networks that make 1,000 different predictions, which takes 70,000 GPU hours. Moreover, this training is not a one-time affair but an iterative one, so all these workflows have to be automated while making sure that these 1,000 different predictions don’t regress over time.
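One simple way to guard against regressions is to compare each prediction's metric before and after a retraining run. The following is a minimal sketch under stated assumptions: the function name, the prediction names, the scores and the tolerance are all hypothetical, and the article does not describe how Tesla actually performs this check.

```python
def find_regressions(baseline, candidate, tolerance=0.01):
    """Return the predictions whose metric dropped by more than `tolerance`."""
    regressed = {}
    for name, old_score in baseline.items():
        new_score = candidate.get(name)
        # A missing prediction counts as a regression too.
        if new_score is None or old_score - new_score > tolerance:
            regressed[name] = (old_score, new_score)
    return regressed


# Hypothetical accuracy per prediction for the deployed and retrained models.
baseline = {"stop_sign": 0.95, "lane_line": 0.90, "traffic_light": 0.88}
candidate = {"stop_sign": 0.96, "lane_line": 0.85, "traffic_light": 0.88}

regressions = find_regressions(baseline, candidate)
# Only "lane_line" is flagged: it fell from 0.90 to 0.85.
```

Run automatically after every training iteration, a check of this kind could block a new model from shipping until the flagged predictions recover.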
When it comes to machine learning frameworks, TensorFlow and PyTorch are widely popular with practitioners. No other framework comes even remotely close to what these two products, of Google and Facebook respectively, have in store. These frameworks have slowly found their niche within the AI community. PyTorch, especially, has become the go-to framework for machine learning researchers.
PyTorch citations in papers on arXiv grew 194% in the first half of 2019 alone, while the number of contributors to the platform has grown more than 50%.
Companies like Microsoft, Uber, and other organisations across industries are increasingly using it as the foundation for their most important machine learning research and production workloads.
And now, with the release of PyTorch 1.3, the platform has received a much-needed boost. It includes experimental support for seamless model deployment to mobile devices, model quantisation for better performance at inference time, and front-end improvements such as the ability to name tensors and create clearer code with less need for inline comments.
The team also has plans in place for launching a number of additional tools and libraries to support model interpretability and bringing multimodal research to production.
Additionally, the PyTorch team has collaborated with Google and Salesforce to add broad support for Cloud Tensor Processing Units, providing a significantly accelerated option for training large-scale deep neural networks.
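Of the PyTorch 1.3 features mentioned above, named tensors are the easiest to show in a few lines. The snippet below is a small sketch of the experimental API; the dimension names "N", "C", "H" and "W" are just a common convention chosen for illustration, and the API may change between releases.

```python
import warnings

import torch

warnings.simplefilter("ignore")  # named tensors emit an "experimental" warning

# A batch of two 3-channel 4x5 images with a name attached to every dimension.
images = torch.zeros(2, 3, 4, 5, names=("N", "C", "H", "W"))

# Reduce over the channel dimension by name instead of by position.
channel_sum = images.sum("C")

# Reorder to channels-last by name; no need to remember index positions.
channels_last = images.align_to("N", "H", "W", "C")
```

The names travel with the tensor, so the code itself documents which axis is which, which is the "less need for inline comments" benefit.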
These timely releases from PyTorch coincide with the deadlines Elon Musk has set for his Tesla team. With the success of Smart Summon, Tesla aims to go fully autonomous in the next couple of years, and we can safely say that it has rightly chosen PyTorch to do the heavy lifting.