Hacker News
by deepdarkforest 225 days ago
by oceanplexian 225 days ago
Except for coding, where it’s essentially middle of the pack, and coding is the only thing you can build objective benchmarks around. The fact that people on LM Arena prefer its output has no bearing on how intelligent the model actually is.