Hacker News
Posted by ipsod 2 hours ago
wolttam 2 hours ago
2000 t/s prompt processing and 40-50 t/s generation. We should see 60-70 t/s generation once DSpark support solidifies in vLLM, which should happen in a few days.
Recent discussion on DSpark:
https://news.ycombinator.com/item?id=48696585