That's not what consumes the most memory at scale. KV caches are per-user, so their total size grows with the number of concurrent users.
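For a rough sense of the numbers, here is a back-of-the-envelope sketch. The model shape (32 layers, 32 KV heads, head dimension 128, fp16, 4096-token context) is a hypothetical configuration picked for illustration, not something from this thread, and `kv_cache_bytes` is just a helper name I made up. It also assumes plain multi-head attention with no cache quantization or sharing between users.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, bytes_per_elem=2):
    """Approximate KV cache size for ONE sequence (one user).

    The leading factor of 2 accounts for storing both keys and values.
    bytes_per_elem=2 assumes fp16/bf16 storage.
    """
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem


# Hypothetical model shape, chosen only for illustration.
per_user = kv_cache_bytes(n_layers=32, n_kv_heads=32, head_dim=128, seq_len=4096)

GIB = 2**30
print(f"per user: {per_user / GIB:.1f} GiB")

# The cache is not shared, so the total scales linearly with concurrent users.
for users in (1, 10, 100):
    print(f"{users:>3} users: {users * per_user / GIB:.0f} GiB")
```

Under those assumptions a single full-context user costs about 2 GiB of cache, and the total is just that multiplied by however many users are active at once.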