Chip designer Arm recently introduced its latest architecture, Armv9, at the Arm Vision Day 2021 event. Arm’s previous architecture announcement came in 2011, with the launch of Armv8. Armv9 offers improvements in performance, speed, and security.
“As we look toward a future that will be defined by AI, we must lay a foundation of leading-edge compute that will be ready to address the unique challenges to come. Armv9 is the answer. It will be at the forefront of the next 300 billion Arm-based chips driven by the demand for pervasive specialized, secure and powerful processing built on the economics, design freedom and accessibility of general-purpose compute,” said Simon Segars, chief executive officer, Arm.
Top features include:
Confidential Compute Architecture
The most significant feature of Armv9 is the Arm Confidential Compute Architecture (CCA). It introduces the concept of dynamically created Realms, usable by all applications, in which data is concealed from the operating system and other apps on the device.
Unlike traditional security models that allow privileged software to see lower-tier applications’ execution, Realms can help shield sensitive data and code from the rest of the device and privileged software by performing computations in a hardware-based secure environment. Think of them as secured, containerised execution environments that are entirely opaque to the OS or hypervisor. Instead of a hypervisor, Realms are managed by a Realm manager, a piece of code roughly one-tenth the size of a hypervisor.
Realms could significantly shorten the chain of trust of a given application running on a device, since the OS becomes largely transparent to security issues. Mission-critical applications that require supervisory controls will be able to run on any device. This is highly beneficial and could phase out the need for businesses to use only dedicated devices with authorised software stacks.
Scalable Vector Extension version two (SVE2)
Armv9 continues to use AArch64, which was introduced in Armv8, as its baseline instruction set. The most important addition to the AArch64 architecture is Scalable Vector Extension version two (SVE2).
The original Scalable Vector Extension (SVE) was introduced in 2016. It was developed following the Neon architecture extension, which has a fixed 128-bit vector length for the instruction set.
Used first in Fujitsu’s A64FX CPU cores, SVE currently powers the Fugaku supercomputer in Japan, touted as the world’s fastest supercomputer. It is particularly suited to High-Performance Computing (HPC) applications that require large amounts of data processing.
SVE2 is a superset of SVE and Neon and inherits the concept, vector registers, and operation principles of SVE. It extends data-level parallelism to more functional domains. SVE and SVE2 define 32 scalable vector registers, and chip designers can choose a vector length between 128 and 2048 bits to suit their hardware implementation.
SVE2 differs from SVE in the functional coverage of its instruction set. While SVE was designed for HPC and machine learning (ML), SVE2 extends the SVE instruction set to data-processing domains beyond HPC and ML. The SVE2 instruction set can enhance standard algorithms used in applications such as computer vision, multimedia, genomics, in-memory databases, web serving, long-term evolution (LTE) baseband processing and general-purpose software.
SVE2 also adds a vector-width-agnostic version of most of Neon’s integer Digital Signal Processing (DSP) and media processing functionality. This helps compilers vectorise more effectively for these domains.
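The vector-width-agnostic idea behind SVE and SVE2 can be illustrated with a small simulation. The sketch below is plain Python, not Arm code or intrinsics: the names `vla_add`, `whilelt` and `LANE_BITS` are hypothetical, the 32-bit lane width is an assumption, and the predicate helper is only loosely modelled on the way SVE loops mask off unused lanes. It shows one loop producing the same result whatever vector length the simulated hardware provides.

```python
from typing import List

LANE_BITS = 32  # assumed element width for this illustration


def whilelt(start: int, n: int, lanes: int) -> List[bool]:
    """Build a per-lane predicate: lane i is active while start + i < n."""
    return [start + i < n for i in range(lanes)]


def vla_add(a: List[int], b: List[int], vector_bits: int) -> List[int]:
    """Add two lists element-wise using a simulated vector of vector_bits bits."""
    # Assumption for this sketch: lengths are multiples of 128 bits in the
    # 128 to 2048 bit range mentioned in the article.
    if vector_bits % 128 != 0 or not 128 <= vector_bits <= 2048:
        raise ValueError("vector length must be a multiple of 128 in [128, 2048]")
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")

    lanes = vector_bits // LANE_BITS  # lanes per simulated vector register
    n = len(a)
    out = [0] * n
    i = 0
    while i < n:
        # The predicate switches off lanes that would run past the end of the
        # data, so no separate scalar tail loop is needed.
        pred = whilelt(i, n, lanes)
        for lane, active in enumerate(pred):
            if active:
                out[i + lane] = a[i + lane] + b[i + lane]
        i += lanes  # the step size depends on the hardware, not on the code
    return out


if __name__ == "__main__":
    a = list(range(10))
    b = [10 * x for x in a]
    for bits in (128, 256, 512, 2048):
        print(bits, vla_add(a, b, bits))
```

The loop never hard-codes a lane count, which is the contrast with Neon’s fixed 128-bit vectors: the same code would run unchanged on a narrow mobile core or a wide HPC core.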
Total Compute Design Methodology
Arm has boosted CPU performance at a rate much faster than the industry average. An Arm spokesperson said the momentum would carry over to the Armv9 generation. Users can expect a CPU performance increase of 30 percent over the next two generations of mobile and infrastructure CPUs.
Arm’s Total Compute design methodology is expected to improve the overall compute performance through system-level hardware and software optimisations.
The Total Compute design principles will be applied across Arm’s entire IP portfolio, spanning automotive, client, IoT, and infrastructure. Further, Arm is also developing technologies to increase frequency, bandwidth, and cache size to maximise the performance of Armv9-based CPUs.
I am a journalist with a postgraduate degree in computer network engineering. When not reading or writing, one can find me doodling away to my heart’s content.