upvote
You got it backwards; it's ~200 on single stream so the 2,600 is achieved with ~13 streams.
reply
Yeah that makes sense. I'm more familiar with seeing tok/s/user + TTFT rather than the total node throughput.
reply
hi yes it’s not optimized for single stream it’s optimized for total node throughput
reply
Oh, that's much better then. A good metric to share is the tokens per second per user for the node rather than the total throughput of the node. It disambiguates what's being optimized for much better than your blog post currently does.
reply
sounds good feedback taken, thanks beffjezos
reply