In this article we are not going to discuss which processor is currently the best for running video games, but rather what an ideal gaming CPU would look like. This is not to say that we know more than the people who design processors. The point is to show that PC processors are the result of a series of design decisions, and that those decisions keep them from being perfect for every task. We also have to admit that manufacturers cannot know what each chip leaving the factory will be used for, and that building a separate chip for every specific use would be a complication.
What would the ideal processor for gaming be like?
First of all, we should clarify that we are not talking about a CPU that stops working when we edit a Word document. It would still run any software, but it would take a different approach from conventional designs. So we start by ruling out, from the outset, a series of elements that would not be in our ideal gaming processor.
Full cores and no multithreading
Multithreading does not double the power of a processor. Instead, it duplicates the per-thread state and part of the hardware for fetching and decoding instructions, so that when one thread stalls (what we call an execution bubble) the other can keep the core busy and bring it closer to 100% utilization. This has a cost, though: it makes the cores larger, which limits the maximum clock speed they can reach under Boost. The reasoning is that a larger core has more circuitry to power and cool, which leaves less headroom for raising voltage and frequency.
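The bubble-filling effect can be illustrated with a toy model. The sketch below assumes that each thread has work ready in a given cycle with probability `ready`, independently of the other threads. The function name, the parameter and the independence assumption are all illustrative and do not describe any real processor.

```python
def core_utilization(ready: float, threads: int = 1) -> float:
    """Fraction of cycles in which at least one thread can issue work.

    Toy model: each thread is ready with probability `ready`,
    independently of the others (an assumption; real threads contend
    for caches and execution units).
    """
    if not 0.0 <= ready <= 1.0:
        raise ValueError("ready must be between 0 and 1")
    if threads < 1:
        raise ValueError("threads must be at least 1")
    stalled_all = (1.0 - ready) ** threads  # every thread is in a bubble
    return 1.0 - stalled_all


single = core_utilization(0.7, threads=1)  # 70% busy
dual = core_utilization(0.7, threads=2)    # 91% busy
print(f"one thread: {single:.0%}, two threads: {dual:.0%}")
```

In this model the second thread raises utilization from 70% to 91%, a clear gain but far from double, which is the point made above.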
The other issue is communication. Although the extra threads are not full cores, they take part in the chip's internal communication just as full cores do. At certain levels this communication uses a ring, in which data advances one segment per clock cycle; the problem is that the more clients sit on the ring, the higher the latency between them. In other words, if today's games need 16 execution threads on 8 cores, what better than to give them 16 full cores able to reach as many GHz as possible?
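How ring latency grows with the number of clients can be estimated with a short calculation. This is a minimal sketch that assumes a bidirectional ring, one clock cycle per segment and traffic spread evenly between all stops; the function name `average_ring_hops` is made up for the example.

```python
def average_ring_hops(stops: int) -> float:
    """Mean shortest-path hop count between two distinct stops on a
    bidirectional ring (one hop is assumed to cost one clock cycle)."""
    if stops < 2:
        raise ValueError("a ring needs at least two stops")
    total = 0
    for distance in range(1, stops):
        # Data can travel either way round, so take the shorter direction.
        total += min(distance, stops - distance)
    return total / (stops - 1)


for stops in (4, 8, 16):
    hops = average_ring_hops(stops)
    print(f"{stops} stops: {hops:.2f} hops on average")
```

Under these assumptions, doubling the number of stops from 8 to 16 roughly doubles the average distance between two clients, from about 2.29 to about 4.27 hops.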
No integrated graphics card in our ideal processor for gaming
Hardly anyone plays on a PC's integrated graphics, and there is now an obsession with handing it support tasks such as video encoding for streaming. For those who want to play at the highest fidelity, that block takes up space on the chip and generates heat; it is a useless element that adds nothing but annoyance. Our brand new GeForce from NVIDIA, Arc from Intel or Radeon from AMD is already good enough for all of that work. In addition, we do not want manufacturers to tie us, in a draconian and monopolistic way, to hardware ecosystems that they will abandon in a couple of years and that limit how we can configure our PC.
DMA units directly connected to the graphics card
DMA units allow the graphics card's GPU to read main RAM directly in order to fetch the command list for the next frame. Resizable BAR (ReBAR) works in the opposite direction: it lets the CPU address the graphics card's entire memory, rather than a small fixed window, and write to it directly.
The idea we are describing goes much further. Since the PCI Express controller now sits inside the CPU, the goal would be to build the command lists for the GPU in the CPU's last-level cache and send them straight to the GPU's VRAM through a dedicated channel. We are not talking about graphics assets here, only the list of commands to execute.
What would we gain? Potentially some valuable time on every frame, since we would skip the whole process of writing the lists out of the cache and looking up the memory address on the graphics card that we want to write to. Naturally, this DMA unit would need to be able to translate memory addresses into the GPU's own address space.
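The address conversion mentioned above can be pictured as a lookup in a table of mapped windows. The following is a purely illustrative sketch: the `Mapping` type, the `translate` function and the addresses are hypothetical and do not correspond to any real driver or hardware interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    cpu_base: int  # start of the window in the CPU's address space
    gpu_base: int  # where that window lands in the GPU's address space
    size: int      # window length in bytes


def translate(cpu_addr: int, mappings: list[Mapping]) -> int:
    """Convert a CPU address to the GPU address it maps to."""
    for m in mappings:
        offset = cpu_addr - m.cpu_base
        if 0 <= offset < m.size:
            return m.gpu_base + offset
    raise ValueError(f"address {cpu_addr:#x} is not mapped for the GPU")


# Hypothetical window: 64 KiB reserved for command lists.
table = [Mapping(cpu_base=0x1000_0000, gpu_base=0x8000, size=64 * 1024)]
print(hex(translate(0x1000_0040, table)))  # 0x8040
```

A real unit would do this in hardware, but the principle is the same: only addresses inside a mapped window are accepted, and each keeps its offset within that window.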
Video game instructions
In a general-purpose processor, which must be able to run any kind of code, complex operations are built from simpler instructions. But what happens when a complex operation is used continuously? It takes many more cycles than a dedicated implementation of that operation as a single instruction would.
The idea is not to add new support units, but to treat instructions that matter little in other fields as first-class citizens. This could mean making them take fewer clock cycles, avoiding contention for resources between them or, failing that, creating new mechanisms for the same instruction that make it run up to 10 times faster. What would this translate into? Since a video game is a program made up of instructions, it should run faster, with the gain depending on how much of its time is spent on those instructions.
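How much faster the whole game would run can be estimated with Amdahl's law, which is brought in here as an illustration and is not part of the original argument. The sketch assumes we know what fraction of run time is spent in the accelerated instructions; the fractions used below are invented for the example.

```python
def overall_speedup(fraction: float, factor: float) -> float:
    """Amdahl's law: whole-program speedup when `fraction` of the run
    time is made `factor` times faster and the rest is unchanged."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if factor <= 0:
        raise ValueError("factor must be positive")
    return 1.0 / ((1.0 - fraction) + fraction / factor)


# Hypothetical shares of run time spent in the accelerated instructions.
for fraction in (0.1, 0.3, 0.5):
    gain = overall_speedup(fraction, 10)
    print(f"{fraction:.0%} of run time made 10x faster: {gain:.2f}x overall")
```

If those instructions accounted for 30% of run time, for instance, a 10x improvement would yield roughly 1.37x overall, so the benefit hinges on how heavily games actually rely on them.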