Calling it SOTA might be a bit provocative, but what actually is the "state of the art"? We have benchmarks, but those are increasingly gamed and don't necessarily reflect a model's actual performance (see Opus 4.7). So I think it's useful to have real-world data from actual users as an additional data point.
Maybe you shouldn't be relying on something if you can't even tell how good it is?
That's pretty much exactly what the title says.

The technical abilities and usage are derived from the commenters' reflections on their own usage.

And assuming every mention is a coding-model mention just because it's on HN.