Shorter PCB traces, because of the insane timing requirements for DDR5, GDDR7, and beyond. GPUs put the memory chips as close as possible around the GPU die to reduce latency and prevent timing/signaling issues.

But even there, the fastest AI accelerator GPUs are putting memory on the package itself (stacked HBM sitting right next to the compute die) and using chiplet designs to get the memory closer and closer to the cores.
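
To put rough numbers on the trace-length point, here is a back-of-the-envelope sketch. It assumes an effective dielectric constant of about 4 for an FR-4 style board (an assumption, real stackups vary) and uses DDR5-6400 as the example speed grade. The function names are made up for illustration.

```python
import math

C_MM_PER_NS = 299.792458  # speed of light in vacuum, in mm per nanosecond


def propagation_delay_ps(length_mm, eps_eff=4.0):
    """Signal flight time along a trace, in picoseconds.

    eps_eff is an ASSUMED effective dielectric constant (~4 for FR-4);
    the signal travels at c / sqrt(eps_eff).
    """
    velocity_mm_per_ns = C_MM_PER_NS / math.sqrt(eps_eff)
    return length_mm / velocity_mm_per_ns * 1000.0


def unit_interval_ps(transfer_rate_mtps):
    """Time one bit occupies on a pin, in picoseconds, for a rate in MT/s."""
    return 1e6 / transfer_rate_mtps


if __name__ == "__main__":
    ui = unit_interval_ps(6400)  # DDR5-6400
    for length_mm in (10, 50, 100):
        delay = propagation_delay_ps(length_mm)
        print(f"{length_mm} mm: {delay:.0f} ps, {delay / ui:.2f} bit times")
```

Under those assumptions, 10 mm of trace costs roughly 67 ps, which is already a bit under half of the ~156 ps bit time at 6400 MT/s. So even small length differences between data lines eat a big chunk of the timing budget, and shorter traces leave less room for that to go wrong.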

Simply physically moving the RAM closer to compute can make communication faster.

Ideally, RAM and compute should be combined. That's kind of what our brains do. We'll probably need more mature memristor technology to achieve that one day.
