DeepSeek V4 Pro is far better at batching multiple tasks together because its KV cache is so much lighter: at most ~10GB at the full 1M context, and it scales linearly with context length according to the DeepSeek V4 release paper. That's extremely impressive. It unlocks batching, agent swarms, etc. even on severely memory-constrained platforms, especially at smaller max context.
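To make the arithmetic concrete, here's a rough back-of-envelope sketch. It takes the ~10GB at 1M tokens figure and the linear scaling at face value; the function names and the 20GB KV budget are hypothetical, picked only for illustration, and it ignores weights, activations and runtime overhead.

```python
# Back-of-envelope only: assumes the ~10 GB @ 1M-token figure quoted above
# and strictly linear scaling with context length.
FULL_CONTEXT_TOKENS = 1_000_000
FULL_CONTEXT_KV_GB = 10.0  # approximate upper bound from the comment above


def kv_cache_gb(context_tokens: int) -> float:
    """Estimated KV cache size (GB) for one sequence of the given length."""
    return FULL_CONTEXT_KV_GB * context_tokens / FULL_CONTEXT_TOKENS


def max_concurrent_sequences(kv_budget_gb: float, context_tokens: int) -> int:
    """Number of sequences of this length that fit in a KV-only memory budget.

    kv_budget_gb is a hypothetical figure: memory left over for KV cache
    after weights and everything else have been accounted for.
    """
    return int(kv_budget_gb // kv_cache_gb(context_tokens))


if __name__ == "__main__":
    budget_gb = 20.0  # hypothetical KV budget
    for ctx in (32_000, 128_000, 1_000_000):
        print(ctx, round(kv_cache_gb(ctx), 2), max_concurrent_sequences(budget_gb, ctx))
```

Under those assumptions a 128K-token sequence needs about 1.28GB of KV cache, so capping max context is what lets many agents share one small budget.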