by jumploops | 19 hours ago
beering | 19 hours ago
With Opus it’s hard to tell what was due to the tokenizer changes. Maybe using more tokens for the same prompt means the model effectively thinks more?
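The point about tokenizer changes can be made concrete with a toy sketch: two hypothetical tokenizers (a coarse word-level one and a finer character-bigram one stand in for the old and new tokenizers; both are invented for illustration) spend a different number of tokens on the exact same prompt.

```python
# Toy illustration (not the actual model tokenizers): the same prompt
# can cost a different number of tokens depending on tokenizer granularity.

prompt = "Maybe using more tokens for the same prompt means the model thinks more?"

def word_tokens(text: str) -> list[str]:
    # Coarse tokenizer: roughly one token per whitespace-separated word.
    return text.split()

def char_bigram_tokens(text: str) -> list[str]:
    # Finer tokenizer: splits into two-character chunks, yielding more tokens.
    return [text[i:i + 2] for i in range(0, len(text), 2)]

old_count = len(word_tokens(prompt))
new_count = len(char_bigram_tokens(prompt))
print(old_count, new_count)  # the finer tokenizer spends more tokens on the same text
```

If a model's compute per response scales with tokens processed, a finer tokenization of the same prompt could translate into more effective "thinking", which is the speculation in the comment above.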
conradkay | 19 hours ago
They say latency is the same as 5.4, and that 5.5 is served on GB200 NVL72, so I assume 5.4 was served on Hopper.