AMD will improve its new graphics cards by adding Tensor Cores

In recent years, terms such as artificial intelligence, deep learning and other buzzwords have largely been another way of describing the addition to hardware of a new type of unit capable of performing matrix operations at high speed. Against all odds, AMD has hidden a detail in its graphics processors with the RDNA 3 architecture that relates to exactly this. These are the so-called WMMA instructions, and they bring a very important change, previously unseen in AMD gaming graphics cards.

Veteran readers will remember how in the 90s the word multimedia resonated as much as artificial intelligence does today. The reason? The addition of SIMD units to the processors of the time. Today the same thing is happening with matrix calculation and artificial intelligence. AMD appears to have finally adopted these units in its consumer GPUs, four years after NVIDIA launched its first RTX cards with them.

A unit for operating on matrices

One of NVIDIA's spearheads in its GPUs is the addition to the RTX line of the so-called Tensor Cores, which are nothing more than a type of execution unit called a systolic array. Its main advantage? The ability to operate on matrices at high speed. These mathematical structures are crucial for artificial intelligence algorithms and for storing large amounts of data. What is an image, for example? A matrix of pixels.

Since a matrix can be treated as a collection of vectors, these operations have until now been carried out on vector units, or SIMD units, which execute the same instruction in parallel on several different operands. However, although they can perform these operations, they do so much more slowly than a systolic array.
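As a rough illustration of the vector approach described above, the following minimal Python sketch multiplies two matrices using only "one operation applied to every lane" steps. The names `vector_fma` and `simd_style_matmul` are hypothetical and chosen for this example. The step counter only illustrates how the work is broken into many vector operations and does not measure real hardware performance.

```python
def vector_fma(acc, scalar, vec):
    """One SIMD-style step: the same multiply-add applied to every lane."""
    return [lane_acc + scalar * lane for lane_acc, lane in zip(acc, vec)]


def simd_style_matmul(a, b):
    """Multiply a (M x K) by b (K x N) one vector step at a time.

    Returns the product and the number of vector steps that were issued.
    """
    n_cols = len(b[0])
    steps = 0
    result = []
    for row in a:
        acc = [0] * n_cols
        # Each element of the row scales one whole row of b.
        for k, scalar in enumerate(row):
            acc = vector_fma(acc, scalar, b[k])
            steps += 1
        result.append(acc)
    return result, steps


product, steps = simd_style_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(product, steps)  # [[19, 22], [43, 50]] 4
```

Even this tiny 2x2 product takes four separate vector steps. A dedicated matrix unit is designed to handle a whole tile of such multiply-adds as a single operation.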

With NVIDIA having its Tensor Cores and Intel having added its XMX units, which are just trade names for the same concept, AMD was the one missing. That said, the absence has been limited to its GPUs for home computers and workstations: in the world of high-performance computing (HPC), its CDNA-based AMD Instinct accelerators have had these units for a long time.

WMMA, Matrix Core Units for the RX 7000

The name that AMD gives to its systolic arrays is Matrix Core Units, which were absent from RDNA and RDNA 2. After several rumors claiming that AMD would not bet on this type of unit in its GPUs, it turns out that the LLVM repository contains references to the inclusion of WMMA instructions for GFX11, which is what AMD calls the RDNA 3 architecture of its future GPUs such as the RX 7000 and processors such as Phoenix.

In the repository, the acronym WMMA is expanded as Wave Matrix Multiply Accumulate, and the instructions are described as follows:

These operations perform matrix multiplication and addition in the form of: D = A*B+C.
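To make the form D = A*B+C concrete, here is a minimal plain-Python sketch of a matrix multiply-accumulate. The function name `matrix_multiply_accumulate` is hypothetical, and the code only illustrates the arithmetic. It is not AMD's instruction or any LLVM interface.

```python
def matrix_multiply_accumulate(a, b, c):
    """Return D = A*B + C for matrices stored as lists of rows."""
    rows, inner, cols = len(a), len(b), len(b[0])
    d = [row[:] for row in c]  # start from a copy of C, the accumulator
    for i in range(rows):
        for j in range(cols):
            for k in range(inner):
                d[i][j] += a[i][k] * b[k][j]
    return d


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
c = [[1, 0], [0, 1]]
print(matrix_multiply_accumulate(a, b, c))  # [[20, 22], [43, 51]]
```

Because C is added to the product, the output of one call can be fed back in as C for the next. This is how a large matrix product is typically built up from small tiles.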

The form D = A*B+C is reminiscent of the structure NVIDIA uses to describe its Tensor Cores. Moreover, AMD's rival also uses the WMMA name. A quick skim of the source code confirms that AMD GPUs are adopting this type of unit and shows the supported data formats:

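In floating point, the resulting matrix (D) is given in 32-bit or 16-bit floating point, the matrices being multiplied (A and B) are in 16-bit floating point or BFLOAT16 format, and the accumulator C uses the same format as D.
In integer mode, D and the accumulator C are 32-bit integers, and the matrices being multiplied (A and B) are 4- or 8-bit integers.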
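A short sketch can suggest why the result D is kept in 32-bit integers even when the inputs are only 8 bits wide. The helper names below are hypothetical, and the row length of 16 is an assumption made for illustration only.

```python
INT8_MIN, INT8_MAX = -128, 127
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def check_range(value, low, high, name):
    """Raise if value does not fit the given integer range."""
    if not low <= value <= high:
        raise OverflowError(f"{name} value {value} outside [{low}, {high}]")
    return value


def int8_dot_int32(a_row, b_col):
    """Dot product of 8-bit inputs, accumulated in a 32-bit-wide total."""
    total = 0
    for x, y in zip(a_row, b_col):
        check_range(x, INT8_MIN, INT8_MAX, "input")
        check_range(y, INT8_MIN, INT8_MAX, "input")
        total = check_range(total + x * y, INT32_MIN, INT32_MAX, "accumulator")
    return total


# Assumed row length of 16, with every input at the largest 8-bit value.
worst_case = int8_dot_int32([127] * 16, [127] * 16)
print(worst_case)             # 258064
print(worst_case > INT8_MAX)  # True: the sum no longer fits in 8 bits
```

A single product of two 8-bit values can already exceed the 8-bit range, and summing many of them grows the total further, so a wider accumulator avoids overflow.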

In summary, AMD has added the WMMA instructions to RDNA 3, which points to the addition of a unit for operating on matrices. In other words, AMD GPUs will finally have their own Tensor Cores. These units are widely used in the world of artificial intelligence and were one of the elements