
The green giant has used the Hot Chips stage to share new details about NVIDIA Grace, a super processor built on the Armv9.0 architecture and designed to deliver a high level of performance thanks to its 72-core configuration. Two processors can also be joined to create a true 144-core “monster”, as we already told you at the time in this article.
NVIDIA Grace will be available in two variants: a standalone one that combines two of these super processors, which gives the 144-core configuration mentioned above, and another that pairs a Grace CPU with a Hopper H100 GPU, the most advanced GPU NVIDIA has developed to date. Returning to the NVIDIA Grace specifications, each processor has a total of 117 MB of L3 cache, 68 PCIe Gen5 lanes and a design based on the TSMC 4N node.
With this in mind, the configuration based on two of those super processors would have a total of 234 MB of L3 cache and could work with up to 136 PCIe Gen5 lanes. Grace supports LPDDR5X memory with ECC (error correction), achieves a chip-to-chip bandwidth of 900 GB/s and, according to NVIDIA, delivers double the performance per watt of one of the most advanced and powerful CPUs currently available.
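The dual-processor totals quoted above follow from doubling the single-chip figures. A minimal sketch of that arithmetic, assuming both chips' resources are fully exposed in the combined package (the names `ChipSpec` and `combine` are illustrative, not anything NVIDIA publishes):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChipSpec:
    """Headline figures for one processor, as quoted in the article."""
    cores: int
    l3_cache_mb: int
    pcie_gen5_lanes: int


def combine(chip: ChipSpec, count: int = 2) -> ChipSpec:
    """Scale every figure linearly by the number of chips.

    Assumption: nothing is reserved or shared between the chips,
    which the article does not confirm.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    return ChipSpec(
        cores=chip.cores * count,
        l3_cache_mb=chip.l3_cache_mb * count,
        pcie_gen5_lanes=chip.pcie_gen5_lanes * count,
    )


GRACE = ChipSpec(cores=72, l3_cache_mb=117, pcie_gen5_lanes=68)
DUAL = combine(GRACE)

print(DUAL.cores, DUAL.l3_cache_mb, DUAL.pcie_gen5_lanes)
```

With the article's single-chip numbers, this reproduces the 144 cores, 234 MB of L3 cache and 136 PCIe Gen5 lanes cited for the two-chip configuration.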
This new NVIDIA CPU supports SVE2 and various virtualization extensions such as Nested Virtualization and S-EL2, is compatible with the entire ecosystem of NVIDIA products and solutions, and uses the NVLink-C2C (chip-to-chip) interconnect to minimize bottlenecks in communication between chips. The 900 GB/s bandwidth mentioned above makes the potential of that interconnect clear.
In total, an NVIDIA Grace CPU supports 512 GB of LPDDR5X memory configured on 32 channels, which gives a bandwidth of up to 546 GB/s. NVIDIA has confirmed that it chose this type of memory for its balance of performance, cost and power consumption. The two-chip, 144-core configuration has a TDP of 500 watts and achieves an estimated SPECrate2017_int_base score of 740, which puts it above the EPYC 7742, a 64-core, 128-thread processor.
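A few rough ratios can be derived from the figures in this paragraph. The sketch below is plain arithmetic on the article's numbers, not an NVIDIA-published metric; it assumes bandwidth is spread evenly across channels and treats TDP as a stand-in for real power draw, which it only bounds from above.

```python
# Figures quoted in the article.
MEMORY_BANDWIDTH_GBS = 546      # peak, per Grace CPU
MEMORY_CHANNELS = 32            # per Grace CPU
SPECRATE_INT_BASE = 740         # estimated, two-chip configuration
TDP_WATTS = 500                 # two-chip configuration
TOTAL_CORES = 144               # two-chip configuration


def ratio(numerator: float, denominator: float) -> float:
    """Divide two positive quantities, rejecting a non-positive denominator."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return numerator / denominator


# Assumption: bandwidth is divided evenly across all channels.
bandwidth_per_channel = ratio(MEMORY_BANDWIDTH_GBS, MEMORY_CHANNELS)

# Assumption: TDP approximates power draw under the benchmark load.
score_per_watt = ratio(SPECRATE_INT_BASE, TDP_WATTS)
score_per_core = ratio(SPECRATE_INT_BASE, TOTAL_CORES)

print(f"{bandwidth_per_channel:.2f} GB/s per memory channel")
print(f"{score_per_watt:.2f} SPECrate points per watt of TDP")
print(f"{score_per_core:.2f} SPECrate points per core")
```

These ratios are only useful for like-for-like comparison with another processor whose figures were obtained the same way.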
According to NVIDIA, Grace is a highly specialized processor aimed primarily at training next-generation NLP models, which have more than 1 trillion parameters. In theory, when this CPU is combined with an NVIDIA GPU, the resulting system will be able to offer up to 10x more performance than the latest generation of NVIDIA DGX-based systems with x86 CPUs.
It will be interesting to see how NVIDIA Grace positions itself when it hits the market, but from what we have seen it has a lot of potential and is a highly competitive solution, so much so that it could emerge as a solid alternative to processors based on the x86 architecture.






