Every processor is built on an underlying “architecture”, a set of fundamental properties that go beyond any single CPU core or physical design. The architecture defines how a processor works, what it can do, how memory is accessed, and much more. A change in processor architecture is an important milestone that brings completely new hardware designs, instruction sets and features.
When it comes to smartphones, we’ve been using processors based on Arm’s Armv8 architecture and its revisions for almost a decade. The introduction of Armv9 will soon be followed by brand new CPU cores for the next-generation SoCs that will power future smartphones. With that crash course out of the way, let’s talk about Arm’s latest Armv9 architecture.
Armv9 is the first new Arm architecture in a decade and will define the next generation of mobile, server and other processors over the next 10 years. For starters, Arm boasts that its next two generations of CPU designs will improve performance by 30% over today’s highest-performing Cortex-X1 CPU core. That figure does not include clock speed and other manufacturing gains, which could push performance even higher. The other key takeaways are that Armv9 is much faster than Armv8 for machine learning workloads and much more secure, to better protect our most sensitive data.
Armv9: Faster machine learning for everyone
Arm is keeping the exact inner workings of Armv9 close to its chest for the time being. We’ll have to wait for the first processors based on the architecture to learn more. These will likely come out later in 2021. However, we do know quite a bit about the advanced machine learning and security features that make up the bulk of the improvements in Armv9.
Let’s start with the number-crunching improvements that come from Arm’s enhanced matrix math capabilities and the second-generation Scalable Vector Extension (SVE2). The first-generation SVE was designed for the Fugaku supercomputer, but SVE2 is designed for general-purpose computing. SVE2 builds on the principles of Arm’s NEON SIMD instructions, but has been redesigned from the ground up to improve data parallelism. Importantly, SVE2 also covers the kind of work NEON handles, so it can be used for DSP (digital signal processing) functions.
Like the first-generation SVE, SVE2 allows flexible rather than fixed vector lengths, implemented in 128-bit steps up to 2048 bits. This gives CPU designers greater control over the number-crunching capabilities of their CPU cores. It also supports new data types and instructions, such as bitwise permute, complex integer multiply-add with rotate, and multi-precision arithmetic for large-integer math and cryptography. SVE2 is also designed to accelerate popular algorithms used in computer vision, multimedia, LTE baseband processing, web serving, and more.
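To make the flexible vector length idea concrete, here is a minimal Python sketch of a vector-length-agnostic loop. It is a toy model, not real SVE2 code: the function names `valid_vector_lengths` and `vla_add` are hypothetical, and the per-lane "predicate" only imitates the way such loops typically mask off lanes that run past the end of the data. The point is that the same loop gives the same result whether the hardware vectors are 128 or 2048 bits wide.

```python
# Toy model of a vector-length-agnostic loop. This is NOT real SVE2 code;
# valid_vector_lengths and vla_add are hypothetical names for illustration.

def valid_vector_lengths():
    """Vector lengths described in the text: 128-bit steps up to 2048 bits."""
    return list(range(128, 2048 + 1, 128))


def vla_add(a, b, vector_bits, element_bits=32):
    """Add two equal-length lists one 'vector' at a time.

    The lane count is derived from vector_bits instead of being hard-coded,
    and a per-lane predicate masks off lanes past the end of the data.
    """
    if vector_bits not in valid_vector_lengths():
        raise ValueError("vector length must be a multiple of 128 up to 2048")
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")

    lanes = vector_bits // element_bits
    out = [0] * len(a)
    for base in range(0, len(a), lanes):
        # Predicate: True only for lanes that still hold valid elements.
        predicate = [base + lane < len(a) for lane in range(lanes)]
        for lane, active in enumerate(predicate):
            if active:
                out[base + lane] = a[base + lane] + b[base + lane]
    return out


if __name__ == "__main__":
    data = list(range(10))
    ones = [1] * 10
    # Same source, different "hardware" vector widths, same result.
    print(vla_add(data, ones, 128))
    print(vla_add(data, ones, 2048))
```

In this model a 128-bit implementation walks the ten elements in three passes of four lanes, while a 2048-bit implementation covers them in a single pass with most lanes masked off.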
SVE2 dramatically accelerates machine learning and other DSP workloads directly on the CPU and reduces the need for external DSP and AI processing hardware. The age of heterogeneous computing is certainly not over yet. Still, Arm sees these functions as so important to the future of computing that any CPU should be able to perform them efficiently.
Armv9: Improved hardware-based security
The importance of security in modern processors cannot be overstated. I’m sure you all remember the commotion over exploits like Heartbleed, Spectre, and the like. To guard against memory leak and overflow problems like these, and to head off new ones in the future, new hardware-based security approaches are required. There are two key elements in Armv9: the Memory Tagging Extension (MTE) and the Realm Management Extension, the latter as part of Arm’s Confidential Compute Architecture (CCA).
Tagged memory will look familiar to those who follow Android development closely, as the feature is already supported by Android 11 as well as OpenSUSE. Arm debuted memory tagging in Armv8.5, but there are no mobile CPU cores based on this revision. MTE was developed to catch memory vulnerabilities with a “lock and key” approach to access. Pointers are tagged when memory is allocated, and the tag is checked on load/store instructions to ensure that memory is being accessed from the correct place. A mismatch raises an exception, so developers can track down potential security issues.
Running memory tagging in hardware on the CPU reduces the performance cost of this verification process. Hardware-based checks are also much more tamper-resistant, making it much more difficult for malicious actors to produce exploits.
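The “lock and key” check described above can be sketched in a few lines of Python. This is a toy software model, not how the hardware implements it. The class and function names (`TaggedMemory`, `TagCheckFault`, `alloc`, `load`, `store`) are hypothetical, and the specifics used here (4-bit tags, 16-byte tagged granules, the tag carried in pointer bits 56 to 59) are assumptions that are not stated in this article.

```python
# Toy software model of an MTE-style "lock and key" check.
# Assumptions (not from the article): 4-bit tags, 16-byte granules,
# and the tag carried in bits 56-59 of the pointer. All names are hypothetical.
import random

GRANULE = 16                          # bytes covered by one memory tag
TAG_SHIFT = 56                        # pointer bit where the tag starts
TAG_MASK = 0xF                        # 4-bit tag
ADDRESS_MASK = (1 << TAG_SHIFT) - 1   # low bits hold the actual address


class TagCheckFault(Exception):
    """Raised when the pointer's tag (key) does not match the memory's tag (lock)."""


class TaggedMemory:
    def __init__(self, size):
        if size % GRANULE:
            raise ValueError("size must be a multiple of the granule size")
        self.data = bytearray(size)
        self.tags = [0] * (size // GRANULE)   # one "lock" per granule
        self.next_free = 0

    def alloc(self, size, tag=None):
        """Bump-allocate whole granules, tag them, and return a tagged pointer."""
        size = -(-size // GRANULE) * GRANULE  # round up to whole granules
        start = self.next_free
        if start + size > len(self.data):
            raise MemoryError("toy heap exhausted")
        if tag is None:
            tag = random.randrange(1, TAG_MASK + 1)  # non-zero random tag
        for granule in range(start // GRANULE, (start + size) // GRANULE):
            self.tags[granule] = tag
        self.next_free = start + size
        return (tag << TAG_SHIFT) | start

    def _check(self, pointer):
        """Compare the pointer's key with the granule's lock before any access."""
        address = pointer & ADDRESS_MASK
        key = (pointer >> TAG_SHIFT) & TAG_MASK
        if address >= len(self.data):
            raise IndexError("address outside the toy heap")
        if self.tags[address // GRANULE] != key:
            raise TagCheckFault(f"tag mismatch at address {address:#x}")
        return address

    def load(self, pointer):
        return self.data[self._check(pointer)]

    def store(self, pointer, value):
        self.data[self._check(pointer)] = value


if __name__ == "__main__":
    mem = TaggedMemory(64)
    first = mem.alloc(16, tag=3)
    second = mem.alloc(16, tag=5)
    mem.store(first + 15, 42)          # in bounds: key matches lock
    try:
        mem.load(first + 16)           # one byte past the end: lands in `second`
    except TagCheckFault as fault:
        print("caught:", fault)
```

Because neighbouring allocations carry different tags in this example, the one-byte overflow from the first buffer into the second is caught at the moment of access rather than silently reading the neighbour's data. With randomly chosen tags, two neighbours could occasionally share a tag, which is why the demo fixes the tags explicitly.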
Arm’s Realm Management Extension and CCA are even more comprehensive. They build on the ideas of Arm TrustZone and enable applications to run in their own secure environment, isolated from the main operating system and other applications. In contrast to hypervisors and virtual machines, where separate operating systems run side by side, Realms also support the secure separation of individual apps and services that share a common operating system. You can think of them as Linux containers, only more secure and built into the hardware.
The idea is simple enough. No realm can see what another is doing, which greatly reduces the risk of sensitive data leaking to another compromised app or even to the operating system. As a result, your banking app’s software and processing resources are securely separated from any game you are playing, which in turn is isolated from Facebook, and so on. Hardware-based security features like these are becoming increasingly important for protecting sensitive data, such as biometric information, stored on our devices.
However, we will have to wait to learn more about exactly how Arm accomplishes this, what is exposed between services, how the operating system shares resources, and so on. We do know that Realms require significant changes at the operating system level, for example to Google’s Android. As a result, Realms will not be supported on first-generation Armv9 processors. The feature is expected to appear a little later in the architecture’s life cycle.
The first Armv9 processors
Arm’s Armv9 architecture will find its way into Arm microcontrollers, real-time processors and application processors in the years to come. The first will fall under the Cortex-A line, which is intended for smartphone SoCs, followed by server chips. Arm expects the first Armv9 chipset for mobile phones to be announced this year. The first devices will hit the market in 2022.
Tucked away in Arm’s press conference was also a slide on upcoming Mali GPU features. These include variable rate shading and ray tracing, two features that are currently making waves in the game console and high-end graphics card markets. There is much to look forward to from the broader Arm hardware portfolio in the years to come.