If you look at the slides for the different processor architectures, regardless of brand, you will see a type of cache labeled the TLB. Like a conventional cache, it is related to the CPU's access to RAM, but unlike a conventional cache it does not store the data itself; it stores the information needed to locate that data.
TLB: Translation Lookaside Buffer
When a processor needs to access the system's RAM, it generates a virtual address that indicates the position of the data in memory from the CPU's point of view, and that address does not match the physical memory address in the system.
The unit in charge of translating one type of address into the other is the MMU, and what these units do is keep the translation tables in a region of RAM that they access directly in order to translate virtual addresses. Of course, we have to take into account that this translation scheme is built from three structures:
- Page: a fixed-size block of memory, a few kilobytes in size.
- Page table: as its name suggests, a table that stores the translations for a group of pages.
- Page directory: a higher-level table that indexes all the page tables.
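As a concrete illustration of how these three structures relate to a virtual address, here is a sketch based on the classic x86 two-level scheme: a 32-bit virtual address is split into a 10-bit index into the page directory, a 10-bit index into a page table, and a 12-bit offset into a 4 KB page. The field widths are those of that particular scheme; other architectures split the address differently.

```python
# Sketch: splitting a 32-bit virtual address under the classic x86
# two-level paging scheme (illustrative field widths: 10-bit directory
# index, 10-bit table index, 12-bit offset into a 4 KB page).

def split_virtual_address(va: int):
    offset = va & 0xFFF               # bits 0-11: byte within the 4 KB page
    table_index = (va >> 12) & 0x3FF  # bits 12-21: entry within a page table
    dir_index = (va >> 22) & 0x3FF    # bits 22-31: entry in the page directory
    return dir_index, table_index, offset

print(split_virtual_address(0x00403ABC))  # → (1, 3, 2748)
```

The MMU uses the first two fields to find the right page table entry, and only the offset passes through to the physical address unchanged.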
Thus, when the MMU converts a virtual address into a physical one, it consults these structures. The page directory sits in storage and is therefore furthest from the processor, but it does not need to be consulted directly, since the relevant page table is copied into RAM. What the TLB cache stores is the address translation for the page the CPU is working on at any given moment.
The reason the memory address is looked up first is that, if the TLB does not hold the translation, it must be requested from the page table in RAM. In other words, a processor does not first ask for a specific instruction or piece of data, but for its location. The other reason is that RAM is always searched for where the information is, not for the information itself.
The working process of the TLB cache
In a very simplified way, it happens as follows:
- The CPU issues a request for a memory address.
- The MMU, using the TLB cache, generates the physical address.
- The first level of the cache is asked whether the data at that physical RAM address exists inside it. If it does, this is marked as a "hit" and the data is modified or returned to the CPU as required.
- If the address being sought is not found in the TLB, the page table in RAM is consulted and the corresponding translation is copied into the TLB.
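The steps above can be sketched as a small simulation. Here the TLB is modeled as a dictionary mapping virtual page numbers to physical frame numbers, with the full page table in RAM as the fallback on a miss; all the page and frame numbers are invented for illustration.

```python
# Minimal sketch of the TLB lookup flow described above.
# Page/frame numbers are invented; a real TLB is a small
# hardware associative memory, not a Python dict.

PAGE_SIZE = 4096

page_table = {0: 7, 1: 3, 2: 9}  # virtual page number -> physical frame
tlb = {}                          # starts empty (cold)

def translate(va: int) -> int:
    vpn, offset = divmod(va, PAGE_SIZE)
    if vpn in tlb:                # TLB hit: no page-table walk needed
        frame = tlb[vpn]
    else:                         # TLB miss: walk the page table in RAM...
        frame = page_table[vpn]
        tlb[vpn] = frame          # ...and cache the translation
    return frame * PAGE_SIZE + offset

print(hex(translate(0x1ABC)))  # miss: fills the TLB from the page table
print(hex(translate(0x1DEF)))  # same page: this time it is a hit
```

The key point the sketch shows is that the second access to the same page never touches the page table: the translation is already in the TLB.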
As with data caches, and in order to speed up access, we can find several levels of TLB: some advanced CPU designs have a TLB for each level of the cache hierarchy. Other designs have one TLB for data and another for instructions, and the more complex ones combine both approaches in the same architecture.
The TLB is therefore a memory that stores the most recent translations made by the MMU, holding the virtual address on one side and the physical one on the other. Thanks to this, the MMU does not have to perform the translation again; it simply consults this small cache to see whether the address belongs to the current page.
Virtual caches, TLB and multicore
The problem with the method we have just described is that every memory request has to go through the TLB cache, and this often proves counterproductive because it adds latency. The solution? The so-called virtual cache, which, contrary to what its name suggests, is not a nonexistent and therefore abstract cache: it is a cache whose contents are indexed by virtual address rather than physical address. Address translation is therefore performed after the data has been looked up in that cache, and only when a "miss" occurs, meaning the data is not in the virtual cache.
The problem arises when two threads of execution each have their own virtual address space, completely separate at the physical address level but using the same virtual addresses. If two applications want to use the same virtual address, both will ask the virtual cache for that address, even though the content each one is looking for is different. This is fatal in a multitasking environment, which today means 100% of cases, since for security reasons every program has its own virtualized memory addressing.
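This collision can be sketched with a simple model. One common hardware fix, tagging each cached translation with an address-space identifier (ASID), is shown below; the ASIDs and frame numbers are invented for illustration.

```python
# Sketch of the collision described above: two processes use the
# same virtual page number for different physical frames. Tagging
# each cached entry with an address-space ID (ASID) keeps them
# apart. All numbers here are invented.

tlb = {}  # (asid, virtual page number) -> physical frame

def tlb_fill(asid, vpn, frame):
    tlb[(asid, vpn)] = frame

def tlb_lookup(asid, vpn):
    return tlb.get((asid, vpn))  # None means a miss

tlb_fill(asid=1, vpn=0x40, frame=7)   # process A
tlb_fill(asid=2, vpn=0x40, frame=12)  # process B, same virtual page

# Keyed only by vpn, the two entries would overwrite each other;
# with the ASID tag, each process gets its own translation back.
print(tlb_lookup(1, 0x40), tlb_lookup(2, 0x40))  # → 7 12
```

Without a tag like this, the alternative is to flush the affected entries on every context switch, which is exactly the kind of overhead that erodes the virtual cache's advantage.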
The virtual cache is not common to all processors, and on its own it involves a double check: in the end the TLB is still needed so that the MMU can perform the translation, which completely eliminates the advantage of having a virtual cache. However, most contemporary TLBs incorporate some form of virtual cache inside, which they check before performing the corresponding translation, gaining performance in the process. In other words, most Translation Lookaside Buffers in today's CPUs integrate the operation of the virtual cache.