Hacker News
by ipdashc 4 hours ago
wmf 3 hours ago
Using multiple chips seems to work fine for Cerebras and Groq, so it should also work for Taalas. It does sound challenging to reach >10K tok/s, but latency could be below 1 µs, which is a small part of the token budget (100 µs per token at 10K tok/s).
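To put rough numbers on the token budget mentioned above, here is a back-of-the-envelope sketch. The function names and the `hops_per_token` parameter are made up for illustration; the comment does not say how many chip-to-chip crossings a token needs, so that count is purely an assumption.

```python
def token_budget_us(tokens_per_second: float) -> float:
    """Time available to produce one token, in microseconds."""
    return 1e6 / tokens_per_second


def hop_fraction(tokens_per_second: float,
                 hop_latency_us: float,
                 hops_per_token: int) -> float:
    """Fraction of the per-token budget spent on chip-to-chip hops.

    hops_per_token is a hypothetical parameter: the number of sequential
    chip crossings on the critical path of a single token.
    """
    total_hop_latency_us = hop_latency_us * hops_per_token
    return total_hop_latency_us / token_budget_us(tokens_per_second)


if __name__ == "__main__":
    budget = token_budget_us(10_000)  # 10K tok/s target
    print(f"budget per token: {budget:.1f} us")
    # Assumed hop counts, for illustration only.
    for hops in (1, 10, 50):
        share = hop_fraction(10_000, 1.0, hops)
        print(f"{hops:>3} hops at 1 us each: {share:.0%} of the budget")
```

With a single 1 µs hop the overhead is about 1% of the budget. It only becomes significant if a token has to cross chips many times in sequence.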