Hacker News
by ntonozzi 6 hours ago
lukebechtel 6 hours ago
Yes, speculative decoding will make both us and vLLM faster, but we believe it would be a relatively even bump on both sides, so we didn't include it in this comparison. Worth another test!