Is NVIDIA working on its own "Infinity Cache" for its GPUs?

NVIDIA's next generation of graphics cards, the RTX 40 Series, should arrive later this year if all goes well, built on the new Ada Lovelace architecture. A lot of information about it has already leaked, and one interesting detail is that this GPU generation will carry a large amount of L2 cache, which inevitably makes us think that NVIDIA could be working on a kind of Infinity Cache of its own.

According to recently leaked data, the NVIDIA RTX 40 Series will have up to 96 MB of L2 cache, which is 90 MB more than graphics cards based on the Ampere architecture (although there is speculation that this figure could be wrong and that the real total would be 192 MB). The leak suggests that Ada Lovelace graphics cards would have 16 MB of cache per 64 bits of memory bus, far more than the 512 KB per 32 bits of bus that Ampere has, which should raise memory performance significantly.
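As a quick sanity check on those figures, here is a minimal Python sketch of the arithmetic. The 384-bit bus width is our own assumption (it is not part of the leak), and the helper `l2_total_mb` is purely illustrative.

```python
KB_PER_MB = 1024

def l2_total_mb(bus_width_bits, slice_kb, slice_bus_bits):
    """Total L2 in MB when every `slice_bus_bits` of bus carries `slice_kb` of cache."""
    return (bus_width_bits // slice_bus_bits) * slice_kb / KB_PER_MB

# Assumption: a 384-bit memory bus on the top chip of both generations.
BUS_WIDTH_BITS = 384

# Ampere: 512 KB per 32 bits of bus. Ada Lovelace (leaked): 16 MB per 64 bits.
ampere_mb = l2_total_mb(BUS_WIDTH_BITS, slice_kb=512, slice_bus_bits=32)
ada_mb = l2_total_mb(BUS_WIDTH_BITS, slice_kb=16 * KB_PER_MB, slice_bus_bits=64)

print(ampere_mb, ada_mb, ada_mb - ampere_mb)  # prints: 6.0 96.0 90.0
```

Under that assumption, the two per-bus ratios reproduce both the 96 MB total and the 90 MB difference quoted in the leak.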

NVIDIA’s “Infinity Cache”: myth or reality?

According to the leaked information, the flagship GPU of this generation would be codenamed AD102 and would have 96 MB of cache in total, while the lower-tier GPUs, AD103 and AD104, would have 64 MB each, AD106 would have 48 MB and AD107 would have 32 MB. Of course, fitting that much cache would require NVIDIA to use a large die, and the latest leaks point that way: the chip is estimated at around 600 mm², roughly the size of the GA102.
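If the leaked ratio of 16 MB of L2 per 64 bits of bus holds across the whole range, the cache sizes above imply a memory bus width for each chip. The sketch below works that out; the resulting widths are our own derivation, not leaked figures, and `implied_bus_bits` is a hypothetical helper.

```python
MB_PER_64_BITS = 16  # leaked ratio: 16 MB of L2 per 64 bits of memory bus

# Leaked L2 cache sizes per GPU, in MB.
leaked_l2_mb = {"AD102": 96, "AD103": 64, "AD104": 64, "AD106": 48, "AD107": 32}

def implied_bus_bits(l2_mb):
    """Bus width in bits implied by an L2 size, assuming the leaked ratio applies."""
    return l2_mb // MB_PER_64_BITS * 64

implied = {gpu: implied_bus_bits(mb) for gpu, mb in leaked_l2_mb.items()}

for gpu, bits in implied.items():
    print(f"{gpu}: {leaked_l2_mb[gpu]} MB L2 -> {bits}-bit bus")
```

This gives 384 bits for AD102, 256 bits for AD103 and AD104, 192 bits for AD106 and 128 bits for AD107, always assuming the ratio is the same on every chip.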

On the other hand, if all these leaks are true, the chips in the upcoming NVIDIA RTX 40 Series would have up to 71% more cores than the current RTX 30 Series (although, as usual, the manufacturer will later cut those figures for the sake of greater stability and to launch lower-tier chips on the market). Naturally, more cores and more memory will have a notable impact on power consumption, and although there has already been a lot of speculation about it, it remains to be seen how NVIDIA solves this “little problem”.

But back to the topic at hand: could we see an in-house take on Infinity Cache in NVIDIA GPUs? Given the large amount of L2 cache that the upcoming RTX 40 Series appears to carry, there is a good chance that we will. NVIDIA already introduced Tile Caching with its Maxwell architecture GPUs, a technology that essentially connected the ROPs and rasterization units directly to the L2 cache to maximize performance by reducing the latency of communication between them.

With Tile Caching, NVIDIA took a giant leap in performance: from that point on, the ROPs no longer needed to write directly to VRAM and instead wrote to the cache they were connected to, which is faster than VRAM. With this change, NVIDIA was able to drastically reduce the power spent on the memory bus while significantly improving performance.

Thus, although we are of course speculating and there is no official information on the matter, we believe that everything gathered so far indicates that NVIDIA could be working on its own Infinity Cache for the next generation of graphics cards, which in theory will hit the market at the end of this year.

However, as we always say, we will have to wait for NVIDIA to provide official details to clear up any doubts, since we could of course be wrong and the RTX 40 Series may simply have a large amount of L2 cache with nothing else special to highlight.