We knew that Apple was going to introduce a new chip, but in the end the company has exceeded our expectations with the Apple M1 Pro and Apple M1 Max SoCs, two chips that keep the ARM architecture at the CPU level and that raise the bar over the previous generation in every respect, as we will see below. Both will be used in Apple’s new MacBook Pros.
The Apple M1 Pro SoC is manufactured on a 5 nm process and, as you may have guessed, TSMC is in charge of producing it. It packs a total of 33.7 billion transistors and has a CPU of up to 10 cores, divided into 8 high-performance cores and 2 high-efficiency cores, as the diagram that Apple has shared clearly shows.
The two high-efficiency cores have a combined 4 MB of L2 cache, while the eight high-performance cores have 24 MB of L2 cache. The difference is remarkable, and the same is true of the supported instruction sets. This is a significant jump over the Apple M1 SoC, which has four high-performance cores and four high-efficiency cores.
The GPU has 16 graphics cores, which break down into a total of 2,048 execution units, popularly known as shaders. Its raw power is 5.2 TFLOPs in FP32, a figure in line with what we can expect from a mid-range graphics card like the GTX 1660, to take a well-known example. This does not mean that Apple’s GPU performs the same as NVIDIA’s solution in every scenario, but it illustrates quite well the great leap that Apple has taken.
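As a rough sanity check on those GPU figures, the sketch below derives the number of execution units per core and the clock speed implied by the quoted peak. It assumes each execution unit completes one fused multiply-add (two FP32 operations) per cycle, which is the usual convention for quoting peak TFLOPs. Apple has not published the GPU clock, so the result is an estimate rather than an official specification, and the function and variable names are purely illustrative.

```python
def implied_clock_ghz(tflops: float, execution_units: int, flops_per_cycle: int = 2) -> float:
    """Clock (in GHz) implied by a peak FP32 figure.

    Assumption: every execution unit retires `flops_per_cycle` FP32
    operations per cycle (2 = one fused multiply-add).
    """
    flops = tflops * 1e12
    return flops / (execution_units * flops_per_cycle) / 1e9


# Figures quoted in the article for the Apple M1 Pro GPU.
GPU_CORES = 16
EXECUTION_UNITS = 2048
PEAK_TFLOPS = 5.2

units_per_core = EXECUTION_UNITS // GPU_CORES
clock = implied_clock_ghz(PEAK_TFLOPS, EXECUTION_UNITS)

print(f"{units_per_core} execution units per core, ~{clock:.2f} GHz implied clock")
```

Under that assumption, the quoted numbers work out to 128 execution units per graphics core and an implied clock of roughly 1.27 GHz.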
The Apple M1 Pro SoC keeps the design that we saw in the Apple M1, which means that Apple has built an SoC that integrates a wide set of elements, including the I/O subsystem, the neural processing unit and the memory subsystem. As we can see in the image, the Apple M1 Pro SoC can be fitted with up to 32 GB of high-bandwidth unified memory. The memory is LPDDR5, uses a 256-bit bus and, because it is integrated right next to the SoC, latency is greatly reduced. In raw terms, it can reach up to 200 GB/s of bandwidth.
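To see where a figure like 200 GB/s comes from, peak bandwidth can be approximated as the bus width in bytes multiplied by the transfer rate. The sketch below applies that formula to the 256-bit bus. The transfer rate of 6,400 MT/s (LPDDR5-6400) is an assumption on our part, since it is not stated above, and the helper name is illustrative.

```python
def peak_bandwidth_gbs(bus_width_bits: int, mega_transfers_per_sec: int) -> float:
    """Theoretical peak bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * mega_transfers_per_sec * 1e6 / 1e9


# Assumption: LPDDR5 running at 6,400 MT/s (not stated in the article).
ASSUMED_MTS = 6400

m1_pro_bw = peak_bandwidth_gbs(256, ASSUMED_MTS)
print(f"256-bit bus at {ASSUMED_MTS} MT/s: {m1_pro_bw:.1f} GB/s")
```

With that assumed rate, the theoretical peak is 204.8 GB/s, which is consistent with the rounded 200 GB/s figure.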
Apple has nearly tripled the bandwidth of the previous generation, but it has also taken care of the rest of the elements that make up the Apple M1 Pro SoC, such as the multimedia engine, which accelerates H.264, HEVC, ProRes and ProRes RAW. The neural processing unit has 16 cores and can perform up to 11 trillion operations per second, while the Secure Enclave adds a significant layer of security. As expected, it supports the Thunderbolt 4 standard, and with the Apple M1 Pro SoC we can connect up to two external displays.
Apple M1 Max: Uncompromising Power
I think there is no better way to define this chip, a solution that keeps the base of the previous one, since it also uses TSMC’s 5 nm process. But it comes with major improvements that set it well apart from the Apple M1 Pro SoC, especially when it comes to GPU performance, as Apple has doubled the number of graphics cores.
The Apple M1 Max SoC packs 57 billion transistors. It also uses a ten-core CPU configuration divided into eight high-performance cores and two high-efficiency cores, and keeps the SoC approach with all elements integrated on one chip. The neural processing unit keeps its 16 cores, which translates into up to 11 trillion operations per second, and the chip has the Secure Enclave and Thunderbolt 4 support. It also retains the multimedia engine, with the same codec support.
However, the differences between the Apple M1 Pro SoC and the Apple M1 Max SoC are, as we anticipated, quite stark. The latter has twice the unified memory, that is, up to 64 GB of LPDDR5 on a 512-bit bus, which doubles the bandwidth (400 GB/s) while keeping latency very low. It also stands out for its support for up to four external displays and for its much more powerful GPU.
The GPU of the Apple M1 Max SoC deserves a closer look, since it is a solution with 32 graphics cores, which is equivalent to 4,096 execution units (shaders). This leaves us with a power of 10.4 TFLOPs in FP32, the same that a GeForce RTX 3050 Mobile can reach, for example. The same logic we saw earlier applies here: this comparison should serve as an approximate reference to put that raw performance in context, not as an absolute equivalence.
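To put the two chips side by side, the following sketch collects the figures quoted in this article into a small data structure and computes the M1 Max to M1 Pro ratio for each one. The `SocSpec` class and the `ratios` helper are hypothetical names introduced only for this comparison; the numbers are the ones given in the text.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class SocSpec:
    """Headline figures as quoted in the article (hypothetical container)."""
    transistors_billion: float
    cpu_cores: int
    gpu_cores: int
    execution_units: int
    fp32_tflops: float
    max_memory_gb: int
    memory_bus_bits: int
    bandwidth_gbs: int
    external_displays: int


M1_PRO = SocSpec(33.7, 10, 16, 2048, 5.2, 32, 256, 200, 2)
M1_MAX = SocSpec(57.0, 10, 32, 4096, 10.4, 64, 512, 400, 4)


def ratios(base: SocSpec, other: SocSpec) -> dict[str, float]:
    """Return other/base for every field of the spec."""
    return {f.name: getattr(other, f.name) / getattr(base, f.name) for f in fields(base)}


for name, ratio in ratios(M1_PRO, M1_MAX).items():
    print(f"{name:20s} x{ratio:.2f}")
```

The output makes the pattern easy to see: everything tied to the GPU and the memory subsystem doubles, the CPU core count stays the same, and the transistor count grows by a factor of about 1.7.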