With MoE models like Deepseek’s and multiple Crescent Island accelerators, the aggregate memory bandwidth actually doesn’t look that bad. Two Crescent Island cards give roughly 1400 GB/s combined, and Deepseek-v4-flash with 13B active parameters gets roughly 100 t/s on that, which is decent for a small team or great for a single user.
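
For anyone who wants to check the arithmetic, here is a rough back-of-envelope sketch. It assumes decoding is purely memory-bandwidth bound (every active weight is read once per token) and that weights are stored at 1 byte per parameter, i.e. 8-bit quantization. Both are my assumptions, not vendor figures, and the function name is made up for illustration. It ignores KV cache reads, inter-card communication and compute limits, so treat the result as an upper bound.

```python
def decode_tokens_per_second(bandwidth_gb_s, active_params_billions, bytes_per_param=1.0):
    """Upper-bound decode speed if each token requires reading all active weights once.

    bytes_per_param=1.0 assumes 8-bit weights (an assumption, not a published spec).
    """
    # Billions of parameters times bytes per parameter gives GB read per token.
    gb_read_per_token = active_params_billions * bytes_per_param
    return bandwidth_gb_s / gb_read_per_token


# Figures from the comment above: ~1400 GB/s aggregate, 13B active parameters.
estimate = decode_tokens_per_second(1400, 13)
print(f"about {estimate:.0f} tokens/s at 8-bit weights")
```

At 16-bit weights (`bytes_per_param=2.0`) the same estimate halves, which is why the quantization level matters as much as the card count.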

Adding more Crescent Island cards scales this up, although probably not entirely linearly.

But all GPU inference works like this; it’s not specific to Intel. Intel is just promising more affordable cards with big memory, which is what makes them attractive.
