
Upcoming AMD GPUs Could Be Faster Thanks to a Stacked AI Processor

At the moment, AMD's only hardware implementation for artificial intelligence is the Matrix Core units in its CDNA GPUs for high-performance computing. Its current gaming GPUs, the RX 6000 series based on the RDNA 2 architecture, lack such units, and there are no references to their implementation in RDNA 3.

Well, the answer could lie in a neural processor stacked on top of the GPU, a die that would include both additional cache and a dedicated AI processor.

AMD’s stacked neural processor


Under the title "Stacked Dies For Machine Learning Accelerators," AMD has published a patent describing a memory die with an integrated neural processor that would sit on top of its GPUs. It would be an evolution of the V-Cache used in its Ryzen CPUs, but designed for its graphics processors, and would therefore increase the capacity of the Infinity Cache. The particularity? The implementation of a neural processor for AI at the other end of the stack.

In fact, a variant of this patent with 2.5D configurations was already published a few months ago. As in that patent, the stacked memory can be configured as cache or as regular RAM, in order to allow its use by the neural processor. Let's not forget that these types of processors require high bandwidth to work.

Why is AMD not integrating it into the Compute Units of its GPU? We do not know, but something tells us this is a way for the Radeon Technologies Group to bring these units to its GPUs as quickly as possible, without having to send the rest of the chip back to the drawing board.

Differences from NVIDIA Tensor Cores

NVIDIA's Tensor Cores are systolic arrays integrated into the GPU's shader units, sharing resources such as registers, caches, and memory access with the rest of the execution units. AMD's stacked neural processor, by contrast, would be a separate accelerator, alongside those already present for the display controller, the hardware video codec, and so on. This means that, unlike NVIDIA's Tensor Cores, AMD's neural processor would be able to work in parallel, since it would not share resources within the GPU.
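To illustrate what a systolic array actually computes, here is a minimal sketch in Python of an output-stationary dataflow, the scheme commonly used to describe units like Tensor Cores. This is a simplified software model, not NVIDIA's or AMD's actual hardware: each processing element (PE) at position (i, j) accumulates one element of the result as skewed operands stream past it, one per cycle.

```python
def systolic_matmul(a, b):
    """Simulate an N x N output-stationary systolic array multiplying a x b."""
    n = len(a)
    acc = [[0] * n for _ in range(n)]  # one accumulator per PE
    # Run enough cycles for every skewed operand to reach every PE.
    for t in range(3 * n - 2):
        for i in range(n):
            for j in range(n):
                k = t - i - j  # index of the operand pair arriving at PE (i, j)
                if 0 <= k < n:
                    # a[i][k] flows in from the left, b[k][j] from the top;
                    # the PE multiplies them and accumulates in place.
                    acc[i][j] += a[i][k] * b[k][j]
    return acc
```

The key property the sketch shows is that every PE performs one multiply-accumulate per cycle with no shared state between PEs, which is why such arrays reach very high throughput on matrix math, and why contention only appears at the edges, where operands are fed in from registers and caches shared with the rest of the GPU.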

In closing, we don't know how the total compute power of AMD's stacked neural processor will compare with what is on the market now. However, building a 3DIC chip is always expensive, and let's not forget that RDNA 3 will be AMD's first gaming GPU based on chiplets, which already makes its design complicated enough.

Will we see it implemented in AMD's next gaming GPUs? Who knows; only time will tell. Perhaps in RDNA 4? Looking at the patent's publication date, that seems more than likely.
