NVIDIA's strategy with the RTX 40 appears to be driven by cost and may have changed compared with previous generations of its gaming GPUs, which would lead to certain cuts, especially in the top-of-the-range model with the AD102 chip. Why?
There is no doubt that the next big node for AMD and NVIDIA GPUs is TSMC's 5 nm. However, the latest information repeats the same message in both cases: the decisions made for both the RTX 40 and RDNA 3 have been shaped by a series of limitations in that manufacturing node, leading to cuts that would otherwise be hard to understand.
One of the surprising things in the rumors around RDNA 3 is that the functionality of the GPU will be divided among several different chips, so it will not be a monolithic design but one based on chiplets. Bear in mind that the probability of a processor having a defect increases as the area it occupies grows, so several small chips yield better than one large one. Let's not forget that the cost of each chip is the cost of the wafer divided by the number of good chips that come out of it, that is, those that can actually be used.
In NVIDIA's case, however, the smaller chips could be a deliberate strategy for the RTX 40 rather than a consequence of technical limitations of the manufacturing process.
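To put rough numbers on that cost argument, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the Poisson yield model, the dies-per-wafer approximation, the 300 mm wafer, the wafer price and the defect density are all placeholders chosen for the example, not figures from TSMC, AMD or NVIDIA.

```python
import math


def dies_per_wafer(die_area_mm2, wafer_diameter_mm=300.0):
    """Approximate gross dies per wafer.

    Uses a common textbook approximation that subtracts the partial dies
    lost at the wafer edge. It ignores scribe lines and edge exclusion.
    """
    radius = wafer_diameter_mm / 2.0
    gross = (math.pi * radius ** 2) / die_area_mm2 \
        - (math.pi * wafer_diameter_mm) / math.sqrt(2.0 * die_area_mm2)
    return max(0, int(gross))


def poisson_yield(die_area_mm2, defects_per_cm2):
    """Fraction of dies with zero defects under a simple Poisson model."""
    area_cm2 = die_area_mm2 / 100.0
    return math.exp(-defects_per_cm2 * area_cm2)


def cost_per_good_die(wafer_cost, die_area_mm2, defects_per_cm2):
    """Wafer cost divided by the expected number of usable dies."""
    good = dies_per_wafer(die_area_mm2) * poisson_yield(die_area_mm2, defects_per_cm2)
    if good <= 0:
        raise ValueError("no good dies expected for these parameters")
    return wafer_cost / good


if __name__ == "__main__":
    WAFER_COST = 10_000.0   # hypothetical price, arbitrary currency units
    DEFECT_DENSITY = 0.1    # hypothetical defects per square centimetre
    for area in (300.0, 600.0):
        cost = cost_per_good_die(WAFER_COST, area, DEFECT_DENSITY)
        print(f"{area:.0f} mm2 die: about {cost:.0f} per good die")
```

With these placeholder values, doubling the die area more than doubles the cost per good die, because the larger chip loses both on how many candidates fit on the wafer and on the share of them that come out defect-free.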
NVIDIA’s strategy with the RTX 40: smaller chips, lower cost
Jensen cost down some small features like NVLINK or extra dp links to reduce pin usage to add gpu power.
— kopite7kimi (@kopite7kimi) March 2, 2022
It should be clarified that there is a very large difference between the reticle limit, which is the size of the largest chip that can physically be manufactured, and the yield limit, beyond which too few good chips come out of each wafer. And it seems that at 5 nm, chips larger than 600 mm² are troublesome. On what basis do we say this? On Kopite's tweet about the possible size of NVIDIA's RTX 40 GPUs.
A 384-bit bus looks insufficient for a GPU with 144 SMs, even with the rumored 96 MB of L2 cache, since an even larger cache would be needed to reduce VRAM accesses as much as possible. So the fact that the RTX 40 would have less bus width to VRAM than the RTX 30, and that NVLink disappears, suggests that NVIDIA was forced to cut total area by reducing the chip's perimeter, possibly because of the cost of TSMC's wafers.
The other possibility is that this is a premeditated strategy by Jensen Huang, and it would not be the first time he has executed one.
The RTX Lovelace are the new Pascal
The situation with the RTX 40, unless Jensen Huang's team is leading us on, is similar to that of the GTX 10. Recall that the gaming versions of those GPUs were an evolution of the previous generation, adapted to a new node and with more units added. The result? Graphics cards with high performance and GPUs small enough to overwhelm AMD. Let's not forget, for example, that the chip in the GTX 1080 measured almost 150 mm² less than that of the RX Vega.
In other words, NVIDIA's strategy with the RTX 40 may not be due to problems with the 5 nm node, but may instead respond to the goal of supplying a gaming market that is eager for graphics cards, though not at any price. There is no doubt that the chiplet implementation of the RX 7000 will be expensive to manufacture, as was the use of HBM2 memory in the RX Vega in its day.
NVIDIA's weapon to beat AMD is still Tensor Cores and DLSS. It does not seem that Lisa Su's team will have a comparable answer, so AMD will need a more powerful GPU to reach the same frame rates. We don't know how this move will work out for Jensen Huang.