Ask any PC hardware engineer what the best type of RAM is today and you will get the same answer: HBM. Its advantages start with the smaller footprint it occupies, thanks to being built from vertically interconnected stacked chips. It is also the most energy-efficient option, consuming roughly half the power per transmitted bit of memories such as GDDR6 while matching the bandwidth of that popular VRAM. The icing on the cake is that it has lower latency than any other, which allows it to be used as a kind of last-level cache.
However, despite having become standard in markets such as high-performance computing, HBM has not caught on in the PC world for one reason: it is expensive to manufacture, since not only the memory but all the components that communicate with it must be mounted on an interposer. This has led to the development of technologies such as TSMC's CoWoS and Intel's EMIB to connect HBM memory with the different processors that accompany it.
The great challenge is that third-generation HBM is not being developed as an evolution of the current standard. This forces a clean slate: designs have to be conceived from the start around a type of memory that, for the moment, does not have a finished standard defining how it should work.
This is how HBM3 memory is evolving
Officially, the JEDEC standard for HBM3 memory does not yet exist and has not been published, but that does not mean there are no players within the industry working on the new standard, such as Rambus or Cadence, companies specialized in creating designs to license to third parties. Common elements such as memory interfaces are not usually developed in-house by companies like Intel, AMD or NVIDIA; instead they are licensed under the guarantee of a standard, which for the moment is not possible with third-generation HBM.
And what do we know about HBM3? First of all, it is expected to reach 8.4 gigabits per second of data transmitted per pin, more than double the fastest HBM2 on the market at the moment, which runs at 3.6 Gbps. For now we do not know what method will be used to achieve such speeds without increasing power consumption.
First- and second-generation HBM use 8 channels of 128 bits each, while HBM3 moves to 16 channels of 64 bits each, so the bus remains 1024 bits per stack; the chips that make up the stack will therefore be smaller and stacked higher. To put these figures in perspective, each HBM3 memory stack can reach up to 1 TB/s of bandwidth, twice the VRAM bandwidth of an RX 6900 XT, and from a single chip.
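As a quick sanity check on those figures, the peak bandwidth of a stack follows directly from the per-pin data rate and the bus width quoted above. A minimal sketch (the function name is ours, for illustration only):

```python
def peak_bandwidth_gbs(pin_rate_gbps: float, bus_width_bits: int) -> float:
    """Peak bandwidth in GB/s: per-pin rate (Gbit/s) times bus width, divided by 8 bits per byte."""
    return pin_rate_gbps * bus_width_bits / 8

# HBM3 figures from the article: 8.4 Gbps per pin on a 1024-bit bus per stack
hbm3 = peak_bandwidth_gbs(8.4, 1024)   # about 1075 GB/s, i.e. roughly 1 TB/s per stack

# The fastest current HBM2 for comparison: 3.6 Gbps per pin, same 1024-bit bus
hbm2 = peak_bandwidth_gbs(3.6, 1024)   # about 461 GB/s

print(f"HBM3: {hbm3:.1f} GB/s, HBM2: {hbm2:.1f} GB/s, ratio: {hbm3 / hbm2:.2f}x")
```

Note that the 16 × 64-bit channel layout of HBM3 and the 8 × 128-bit layout of HBM1/HBM2 both add up to the same 1024-bit bus, so the doubling of bandwidth comes entirely from the higher per-pin speed.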
Why is HBM3 seen as the memory of the future for HPC?
The reason HBM3 memory is one of the building blocks of high-performance computing is none other than being ideal for creating a fully coherent memory-addressing environment: more than excellent bandwidth for graphics and artificial intelligence, combined with latency low enough for the CPU.
This is key to the HPC projects that Intel and AMD are pursuing separately: both AMD's EHP for El Capitan and Intel's Aurora supercomputer have the CPU and GPU as the central element on an interposer that interconnects them with next-generation HBM memory. We should not even rule out NVIDIA's Grace CPU and Hopper GPU duo also using HBM3 as the cohesive element to reach new levels of performance.
Intel, for example, already plans to use HBM2e in its future Sapphire Rapids, especially to speed up the AI-related algorithms that run on the AMX units inside the new cores. In the future this will be combined with a GPU that shares a single pool of HBM3 memory.