AMD Debuts New Chiplet Design For Scalability
Samuel Wan / 4 months ago
AMD Reveals Scalable Chiplet Design
One of the struggles with high core counts is communication, both between cores and with everything outside the CPU. AMD has been grappling with this problem since the launch of Zen. With designs of up to 32 cores, and now 64 cores, the company has had to rethink its old MCM (multi-chip module) strategy. While Infinity Fabric has been very helpful, the way the MCM worked meant unequal access to memory and other I/O. To address this problem, the company is moving to a new Chiplet-style design.
With the MCM, the lopsided arrangement meant that some CPU dies could not always access I/O and memory directly. At huge core counts, this becomes a real efficiency problem. A band-aid solution was to balance work across cores based on the I/O load of the workload. The new Chiplet design resolves the problem by giving every Chiplet and core equal access and latency to the I/O. Each Chiplet is its own CCX, with a direct connection to the I/O die. Each CCX should still retain its own PCIe connections.
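To make the difference concrete, here is a minimal sketch that counts interconnect hops from a core to memory under the two layouts. It is a toy model, not AMD's actual topology: the die counts, the function names and the one-hop cost per link are all assumptions chosen for illustration.

```python
# Toy model: count interconnect hops from a core to a memory controller.
# Die counts and hop costs are illustrative assumptions, not AMD specifications.

def mcm_hops(core_die: int, memory_die: int) -> int:
    """Old MCM layout (assumed): each die owns its own memory controller.
    Local memory needs 0 die-to-die hops; memory behind another die needs 1."""
    return 0 if core_die == memory_die else 1

def chiplet_hops(core_chiplet: int, memory_channel: int) -> int:
    """Chiplet layout (assumed): every memory controller sits on the I/O die,
    so every access crosses exactly one Chiplet-to-I/O-die link."""
    return 1

def hop_table(hops, n_dies: int, n_mem: int):
    """Hop count for every (die, memory controller) pair."""
    return [[hops(d, m) for m in range(n_mem)] for d in range(n_dies)]

def spread(table) -> int:
    """Difference between the worst and best case; 0 means uniform access."""
    flat = [h for row in table for h in row]
    return max(flat) - min(flat)

mcm = hop_table(mcm_hops, 4, 4)          # 4 dies, 4 memory controllers (assumed)
chiplet = hop_table(chiplet_hops, 8, 8)  # 8 Chiplets, 8 memory channels (assumed)

print("MCM hop spread:", spread(mcm))
print("Chiplet hop spread:", spread(chiplet))
```

In this model the MCM layout has a fast local case and a slower remote case, while the Chiplet layout costs the same from every Chiplet. Note that the uniform case is never as short as a local access on the old design; it is simply the same for everyone.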
Chiplet Design Resolves I/O Latency Shortcomings
AMD is connecting the Chiplets to the I/O die using Infinity Fabric, and through it to the memory controllers and the other Chiplets. This lets the chip scale evenly across many Chiplet and memory controller configurations. Due to the speed of Infinity Fabric, memory bandwidth should not be an issue. The only minor hiccups I foresee are latency and a slight efficiency hit in certain multicore workloads. However, the overall impact should be positive, and the design should be more flexible. It bypasses the problems with die-to-die communication that a 64-core chip on the old design would have caused.
Finally, moving the I/O hub off the CPU dies means better scalability for CPUs. Memory controllers and physical interfaces often do not shrink well on newer process nodes. By splitting them off, the complex cores can use a smaller, cheaper 7nm die while the I/O hub can use the mature 14nm process. This should improve binning and lower costs. Smaller Chiplet dies also mean the cores can move onto new process nodes earlier at the same cost and improve performance. It makes sense, then, that AMD, having first moved the memory controller onto the CPU die, is now moving it off the die while keeping it on the package. Overall, it looks like AMD has gone with a nice new approach, similar to other third-party proposals.
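The cost argument for smaller dies can be sketched with the textbook Poisson yield model, where the chance of a die having no defects falls off exponentially with its area. The defect density and die areas below are hypothetical values picked only to show the trend; they are not AMD or foundry figures.

```python
import math

def poisson_yield(area_mm2: float, defects_per_mm2: float) -> float:
    """Fraction of dies expected to have zero defects (Poisson defect model)."""
    return math.exp(-area_mm2 * defects_per_mm2)

# Hypothetical numbers, for illustration only.
DEFECT_DENSITY = 0.002  # defects per mm^2 (assumed)
LARGE_DIE_MM2 = 400.0   # one large monolithic die (assumed)
SMALL_DIE_MM2 = 100.0   # one small Chiplet-sized die (assumed)

large_yield = poisson_yield(LARGE_DIE_MM2, DEFECT_DENSITY)
small_yield = poisson_yield(SMALL_DIE_MM2, DEFECT_DENSITY)

print(f"large die yield: {large_yield:.1%}")
print(f"small die yield: {small_yield:.1%}")
```

Under these assumptions a single defect scraps far less silicon when the die is small, which is why a small die is a cheaper way to adopt a new and still immature process.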