
AMD confirms a very curious error that affects EPYC Rome processors

The Sunnyvale company has acknowledged an error affecting AMD EPYC Rome processors, and it is honestly so curious and so interesting that I could not miss the opportunity to tell you about it. First of all, a reminder: this generation of CPUs is based on the Zen 2 architecture and has therefore been on the market for a few years.

According to AMD, the error can be summarized as follows: “cores will not be able to exit the CC6 (idle) state after 1,043 days have passed since the last system reboot”. In other words, a processor core becomes unable to wake up from sleep once an AMD EPYC Rome system has been running continuously for roughly 1,044 days, or about 34 months.

According to the official description given by AMD, the problem occurs because the CPU's REFCLK counter counts 10 ns ticks in a 54-bit signed integer. Once it has counted a little more than 9 quadrillion of these ticks, which takes approximately 1,043 days, the counter overflows.
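The arithmetic behind that figure is easy to check. The short Python sketch below assumes exactly what the description above states (a signed 54-bit counter incremented once every 10 ns); the variable names are only illustrative.

```python
# Assumptions taken from the description above: a signed 54-bit counter
# that is incremented once every 10 ns.
COUNTER_BITS = 54
TICK_NS = 10

# A signed 54-bit integer has 53 magnitude bits, so it tops out at 2**53 - 1.
max_ticks = 2 ** (COUNTER_BITS - 1) - 1

# Convert the number of ticks to seconds, then to days.
seconds_to_overflow = max_ticks * TICK_NS / 1e9
days_to_overflow = seconds_to_overflow / 86_400

print(f"max ticks: {max_ticks:,}")
print(f"days until overflow: {days_to_overflow:.1f}")
```

The result is a little over 9 quadrillion ticks and roughly 1,042.5 days, which lines up with the figure of about 1,043 days.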


Once this overflow occurs, the cores get stuck in sleep mode forever and become “zombies” that will not accept any of the external interrupt requests that would normally wake them up. The only ways to avoid this error are to shut down or restart the system, so that the counter is reset and everything returns to normal, or to disable the CC6 state, and in either case this must be done before the failure occurs.
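Since the workaround only helps if it is applied in time, an administrator might want to keep an eye on uptime. What follows is a minimal sketch, not an AMD tool: the function names, the 30-day safety margin and the 1,042.5-day threshold (derived from the assumed 54-bit, 10 ns counter) are all illustrative assumptions, and operating system uptime is only an approximation of the time since the last hardware reset.

```python
# Hypothetical helper: warn when a machine is getting close to the overflow.
# The threshold is an approximation based on the assumed counter, not an
# official AMD value.
OVERFLOW_DAYS = 1042.5
SECONDS_PER_DAY = 86_400


def days_until_overflow(uptime_seconds, overflow_days=OVERFLOW_DAYS):
    """Return the estimated number of days left before the counter overflows."""
    return overflow_days - uptime_seconds / SECONDS_PER_DAY


def needs_reboot(uptime_seconds, margin_days=30.0):
    """Return True when the estimated time left is within the safety margin."""
    return days_until_overflow(uptime_seconds) <= margin_days


def read_uptime_seconds(path="/proc/uptime"):
    """Read the uptime in seconds on Linux (first field of /proc/uptime)."""
    with open(path) as f:
        return float(f.read().split()[0])
```

On a Linux server, calling `needs_reboot(read_uptime_seconds())` from a scheduled job would be one way to flag machines that should be restarted during the next maintenance window.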

The really striking thing about all this is that a computer based on AMD EPYC Rome CPUs has to run without interruption, that is, without being turned off or reset, for almost three years before this error occurs.

Given that we are talking about a server processor, such long uptimes would not be unusual, since continuous operation matters a great deal in this type of environment. We must not forget, however, that in the end you also have to stop to install important updates and security patches, and that these normally force a reboot of the system. AMD has confirmed that it does not plan to resolve this bug.
