A major announcement from the AWS re:Invent event is the launch of the company’s custom machine learning chip, Trainium. It is the second ML chip from AWS after Inferentia, which was launched amid much fanfare at last year’s event. While both share the same AWS Neuron SDK, Trainium is claimed to deliver better performance for training ML models in the cloud in a cost-effective manner. With support for TensorFlow, PyTorch and MXNet, AWS says Trainium will offer the most teraflops of computing power for machine learning in the cloud.
It can perform a variety of deep learning training workloads across applications such as image classification, translation, voice recognition, natural language processing, recommendation engines, among others.
How It Takes On Inferentia
As compute-intensive workloads increase, the need for high-efficiency chips is growing dramatically. AWS is expanding its custom chip capabilities to meet these demands across the end-to-end ML lifecycle. Its larger aim is to make deep learning pervasive for everyday developers and to democratise access to cutting-edge infrastructure in an affordable manner.
Building on these goals, AWS says Trainium offers the highest performance with the most teraflops of computing power while enabling a wide range of ML applications. For reference, a chip rated at one teraflop can perform around one trillion floating-point operations per second.
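To make the teraflop figure concrete, here is a minimal Python sketch of the arithmetic. The function name `seconds_to_run`, the workload size and the throughput values are all hypothetical and chosen only for illustration; they are not Trainium specifications, and real training jobs rarely sustain a chip's peak rate.

```python
TERA = 10**12  # one teraflop = one trillion floating-point operations per second


def seconds_to_run(total_flops: float, teraflops: float) -> float:
    """Ideal time, in seconds, to execute total_flops at a sustained teraflops rate."""
    if teraflops <= 0:
        raise ValueError("teraflops must be positive")
    return total_flops / (teraflops * TERA)


# Hypothetical workload size (an assumption for illustration, not a measured figure).
workload_flops = 3.6e18

# Hypothetical sustained throughputs, in teraflops.
for rate in (1, 10, 100):
    hours = seconds_to_run(workload_flops, rate) / 3600
    print(f"{rate:>3} teraflops: about {hours:.0f} hours")
```

Under these assumptions, every tenfold increase in sustained teraflops cuts the ideal run time by a factor of ten, which is why the teraflop count features so prominently in claims about training chips.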
Most importantly, Trainium has essentially been launched to fill the gap that Inferentia leaves open, namely training. Together the two chips provide an end-to-end flow of ML compute, from scaling training workloads to deploying accelerated inference, with Trainium pitched as the more cost-effective option that can carry a more extensive range of ML workloads.
Inferentia has so far delivered good results on ML tasks. However, the growing use of ML has created a need to improve both inference and training performance while keeping costs in tight check. Trainium addresses this by ensuring that customers have an end-to-end flow of ML compute, from training workloads to deploying accelerated inference. AWS says Trainium chips offer high performance, low latency and flexibility.
“While the cost of inference, which accounts for up to 90% of ML infrastructure costs, was addressed by Inferentia, many development teams are still constrained by fixed ML training budgets. This puts a limit on the extent and frequency of training necessary to enhance their models and applications. By offering the highest performance and lowest cost for cloud ML preparation, AWS Trainium answers this challenge,” stated the company.
The company believes that by combining Trainium and Inferentia, it will be able to offer an end-to-end flow of ML compute, from scaling training workloads to rapid inference deployment.
Furthermore, AWS is collaborating with Intel to introduce EC2 instances for machine learning training based on Habana Gaudi, which are expected to deliver up to 40% better price performance by next year.
Can researchers currently working with Inferentia switch to Trainium? Because both share the same AWS Neuron SDK, developers using Inferentia should find it easy to get started with Trainium. The company also notes that developers can migrate from GPU-based instances to Trainium with minimal code changes.
While Trainium can be compared to Google’s tensor processing units, which run AI training workloads hosted on Google Cloud Platform, the offerings differ at many levels and a clear comparison cannot be made at this point. It can also compete with some of the AI chips launched this year, such as IBM Power10, which claims to be three times more efficient than previous models of the POWER CPU series, or NVIDIA A100, which claims to offer 6x higher performance than NVIDIA’s previous-generation chips.
Having said that, with these new chips, AWS is aiming big, targeting enterprises to help them train ML models in an efficient and cost-effective way, and help them build stronger AI strategies.