For over two years, Tesla CEO Elon Musk has been teasing the development of Tesla's supercomputer, called "Dojo." Last year, he even claimed that Dojo would have a capacity of over an exaflop, or one quintillion (10^18) floating-point operations per second.
Tesla unveiled the Dojo supercomputer for the first time at AI Day, even though Elon Musk has been talking it up on Twitter for almost a year. Tesla claims it is the world's fastest computer for training machine-learning algorithms.
The chip was designed from the ground up in-house at Tesla. It is primarily intended for the computer-vision workload of camera-based self-driving. Tesla has been collecting data from over 1 million vehicles to train its neural networks using the Dojo supercomputer.
Specifications & Claims:
Dojo is powered by Tesla's custom chip, the D1, which is built on a 7nm process and delivers 362 teraflops of processing power. Tesla claims the D1 combines GPU-level compute with CPU-level flexibility and networking-switch-level IO.
The chips can connect to each other seamlessly, without any glue logic, and Tesla took advantage of this by linking 500,000 training nodes. The result is a nine-petaflop training tile with 36TB per second of bandwidth, in a package of less than one cubic foot. Tesla achieved this by going against the general industry norm of cutting the wafer into individual dies: it leaves 25 SoCs together on the wafer, using only high-quality silicon. This lets the chips communicate with each other without any loss in speed while preserving signal quality.
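The tile-level figure follows from simple arithmetic on the numbers quoted above; a quick back-of-the-envelope check (assuming 362 TFLOPS per D1 chip and 25 chips per tile, both vendor-quoted claims rather than measured results):

```python
# Back-of-the-envelope check of Tesla's training-tile claim,
# using only the vendor-quoted figures from the article.
CHIP_TFLOPS = 362      # claimed throughput of one D1 chip, in teraflops
CHIPS_PER_TILE = 25    # D1 SoCs left together on each training tile

tile_pflops = CHIP_TFLOPS * CHIPS_PER_TILE / 1000  # teraflops -> petaflops
print(f"{tile_pflops:.2f} PFLOPS per tile")        # ~9 PFLOPS, matching the claim
```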
Tesla only needs 120 fully functional wafers for Dojo. By comparison, Intel produced more than 130,000 300mm wafers in 2014 alone. And since each Dojo tile uses only a small five-by-five section of a wafer, the costs should be significantly lower.
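Taking the quoted figures at face value, 120 such nine-petaflop tiles would put Dojo past the exaflop mark Musk teased; a quick sanity check (assuming one training tile per wafer, which the article implies but does not state outright):

```python
# Does 120 tiles reach the teased exaflop? Vendor-quoted figures only.
TILE_PFLOPS = 25 * 362 / 1000  # one training tile (25 chips x 362 TFLOPS)
TILES = 120                    # wafers/tiles Tesla says Dojo needs

total_eflops = TILE_PFLOPS * TILES / 1000  # petaflops -> exaflops
print(f"{total_eflops:.3f} EFLOPS")        # ~1.086 EFLOPS, just past an exaflop
```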
The company also claims that the computer has no RAM outside the SoC. By contrast, even a smartphone, the latest hard drives, and Tesla's own HW3 keep RAM chips outside the SoC. Tesla instead relies on on-chip cache, a faster tier of random-access memory.
Dojo Vs Competitors
Chipmaker Intel, graphics card maker Nvidia, and start-up Graphcore are among the companies that make chips used to train AI models.
The Intel Mobileye EyeQ4 chip can perform 2.5 TOPS while consuming 3 watts of power. The chip is designed to support a fully autonomous driving system. Each EyeQ chip consists of heterogeneous, fully programmable accelerators, each optimised for its own family of algorithms. The chip also has general-purpose multi-threaded CPU cores, making it a robust computing platform for ADAS/AV applications. It supports 40Gbps of data bandwidth and can accommodate additional sensors via PCIe and Gigabit Ethernet ports.
Until now, Tesla has been using Nvidia's Drive PX2 chip to implement Autopilot and accelerate its autonomous-vehicle programme. The chip can understand the vehicle's environment in real time, locate itself on a map, and use that information to plan a safe path. Nvidia has claimed it is the world's most advanced self-driving car platform, combining deep learning with sensor fusion and surround vision. The platform scales from a single mobile processor operating at 10 watts up to a multi-chip configuration with two mobile processors and two discrete GPUs, which together deliver 24 trillion deep-learning operations per second.
Another chip in the Drive series, the PX Xavier, consumes only 20 watts of power while delivering 20 TOPS of performance, packed into 7 billion transistors. Nvidia Drive Pegasus combines two Xavier SoCs with GPUs based on Nvidia's Turing architecture to deliver 320 TOPS while consuming 500 watts. Nvidia is positioning the platform for Level 4 and Level 5 autonomous systems.
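One way to line up the chips discussed here is performance per watt, using only the figures quoted in this article (vendor-quoted numbers, not independent benchmarks, and at different precision/workload baselines, so treat the comparison as rough):

```python
# Rough TOPS-per-watt comparison of the chips mentioned above.
# All figures are vendor-quoted, not independently benchmarked.
chips = {
    "Mobileye EyeQ4":       (2.5, 3),    # (TOPS, watts)
    "Nvidia Drive Xavier":  (20, 20),
    "Nvidia Drive Pegasus": (320, 500),
}

for name, (tops, watts) in chips.items():
    print(f"{name}: {tops / watts:.2f} TOPS/W")
```

By this crude metric the low-power EyeQ4 and Xavier look more efficient than Pegasus, which trades efficiency for raw throughput aimed at Level 4/5 systems.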
Among graphics processors, though, Nvidia's GA100 Ampere GPU still comes out on top with 54 billion transistors.
Living Up To The Hype
Tesla has a history of unveiling brilliant concepts that never materialised. Many of the products Musk has hyped at events were barely commercialised, are still unavailable, or were abandoned at the ideation stage, from the Model X back in 2012 to more recent examples like Tesla Energy and the Solar Roof. The company has not published an official research paper on the supercomputer either.
Tesla enthusiasts have taken to Twitter to express mixed views on the Dojo chip. While some question Tesla's claims ('Designing a chip is easy. What's hard is building the compiler, runtime scheduler in an HPC environment at scale', or 'Dojo is the CPAP ventilator of machine learning runtimes'), others believe Tesla is giving its competitors a run for their money.
Tesla claims the chip architecture allows for a 10x improvement in the next iteration of the computing appliance. When it comes to Dojo, it is essential to remember that, for now, only Tesla's claims and designs can be compared against the existing, running systems from Nvidia, Intel, and Graphcore. Tesla hasn't put the entire system together yet, but Musk believes it will be fully operational next year.