“The SoC is the new motherboard.” Google Cloud
Data centres are no longer betting on one-size-fits-all compute. Decades of homogeneous compute strategies are being disrupted by the need to optimise. Modern data centres are embracing purpose-built System on Chip (SoC) designs to gain more control over peak performance, power consumption, and scalability. Customised chips have thus become the go-to solution for many cloud providers, and Google Cloud in particular is doubling down on this front.
How Cloud Giants Are Embracing The Silicon
Google introduced the Tensor Processing Unit (TPU) back in 2015. Today, TPUs power services such as real-time voice search, photo object recognition, and interactive language translation. TPUs drive DeepMind’s powerful AlphaGo algorithms, which outclassed the world’s best Go player and were later used for Chess and Shogi. TPUs can now process over 100 million photos a day. Most importantly, they are also used for Google’s search results. The search giant also unveiled OpenTitan, the first open-source silicon root-of-trust project. The company’s custom hardware solutions range from SSDs and hard drives to network switches and network interface cards, often built in deep collaboration with external partners.
“Workloads demand even deeper integration into the underlying hardware.”
Just like on a motherboard, CPUs and TPUs come from different sources. A Google data centre consists of thousands of server machines connected to a local network. Google designs custom chips, including a hardware security chip currently being deployed on both servers and peripherals. According to Google Cloud, these chips allow them to securely identify and authenticate legitimate Google devices at the hardware level.
According to the team at GCP, computing at Google is at a critical inflection point. Instead of integrating components on a motherboard, Google is focusing more on SoC designs, where multiple functions sit on the same chip or on multiple chips inside one package. The company even claims that the SoC is the modern-day motherboard.
To date, writes Amin Vahdat of GCP, the motherboard has been the integration point where CPUs, networking, storage devices, custom accelerators, and memory, all from different vendors, are blended into an optimised system. However, cloud providers that own large data centres, such as Google Cloud and AWS, are gravitating towards deeper integration in the underlying hardware to gain higher performance at lower power consumption.
According to Arm, which NVIDIA agreed to acquire in 2020, renewed interest in design freedom and system optimisation has led to higher compute utilisation, improved performance-power ratios, and the ability to get more out of a physical data centre.
For example, AWS Graviton2 instances, built on the Arm Neoverse N1 platform, deliver up to 40 percent better price-performance than the previous x86-based instances at a 20 percent lower price. Silicon solutions such as Ampere’s Altra are designed to deliver the performance-per-watt, flexibility, and scalability their customers demand.
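To put the Graviton2 figures in perspective, here is a rough back-of-the-envelope sketch. It assumes that price-performance means performance per unit of price and that both "up to" figures apply to the same comparison, neither of which the quoted numbers confirm. The function name is purely illustrative.

```python
def implied_performance_ratio(price_perf_gain: float, price_cut: float) -> float:
    """Return the new/old raw performance ratio implied by the two figures.

    Assumption: price-performance = performance / price, so
    perf_new / perf_old = (1 + price_perf_gain) * (1 - price_cut).
    """
    return (1.0 + price_perf_gain) * (1.0 - price_cut)


# Figures quoted above: up to 40% better price-performance at a 20% lower price.
ratio = implied_performance_ratio(0.40, 0.20)
print(f"Implied raw performance ratio: {ratio:.2f}x")
```

Under these assumptions, most of the headline gain comes from the lower price, with a more modest uplift in raw performance.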
“The capabilities of cloud instances rely on the underlying architectures and microarchitectures that power the hardware.”
Amazon made its silicon ambitions obvious as early as 2015, when it acquired Israel-based Annapurna Labs, known for networking-focused Arm SoCs. Amazon leveraged Annapurna Labs’ tech to build a custom server-grade Arm chip, Graviton2. After its release, Graviton2 locked horns with Intel and AMD, the data centre chip industry’s major players. While the Graviton2 instance offered 64 physical cores, the comparable AMD and Intel instances could manage only 32 physical cores.
Last year, AWS also launched its custom-built Inferentia chips, pushing further into hardware specialisation. Inferentia’s performance convinced AWS to deploy the chips for its popular Alexa services, which require state-of-the-art ML for speech processing and other tasks.
Amazon’s EC2 Inf1 instances are powered by AWS Inferentia chips, which can deliver up to 30% higher throughput and up to 45% lower cost per inference. Amazon EC2 F1 instances, by contrast, use FPGAs to deliver custom hardware acceleration. F1 instances are easy to program, come with an FPGA Developer AMI, and support hardware-level development in the cloud. Target applications that can benefit from F1 instance acceleration include genomics, search/analytics, image and video processing, network security, electronic design automation (EDA), image and file compression, and big data analytics.
Following Inferentia’s success in providing customers with high-performance ML inference at the lowest cost in the cloud, AWS is launching Trainium to cover what Inferentia does not: training. The Trainium chip is specifically optimised for deep learning training workloads in applications such as image classification, semantic search, translation, voice recognition, natural language processing, and recommendation engines.
The above table is a performance comparison by AnandTech, which shows how cloud providers can ditch the legacy chip makers thanks to Arm’s licensing model. Even Microsoft is reportedly building an Arm-based processor for its Azure data centres. Apart from the custom chips that are still under wraps, Microsoft has already had a shot at silicon: it collaborated with AMD, Intel, and Qualcomm Technologies to announce the Microsoft Pluton security processor. The Pluton design builds security directly into the CPU.
To overcome the challenges and realise the opportunities presented by semiconductor densities and capabilities, cloud companies will look to System-on-a-Chip (SoC) design methodologies that incorporate pre-designed components, also called SoC Intellectual Property (SoC-IP), which can then be integrated with their own algorithms. SoCs incorporate processors that allow customisation both in the layers of software and in the hardware around the processors, which is why even Google Cloud is bullish on them. The company even roped in Intel veteran Uri Frank to lead its server chip design efforts. According to Amin Vahdat, VP, GCP, SoCs offer many orders of magnitude better performance with greatly reduced power and cost compared to assembling individual ASICs on a motherboard. “The future of cloud infrastructure is bright, and it’s changing fast,” said Vahdat.