upvote
Yeah I’ve been consistently underwhelmed by anthropic models, but then I don’t use their harness so maybe that’s it
reply
In my experience, for more mechanical refactoring work (like splitting a big source code file into multiple smaller ones), GPT 5.5 runs way faster than any of the Claude models. But for other tasks that require deeper reasoning, it's not that clear who is the winner.
reply
It's just too funny to see people arguing about "no, it's my religion that's the right one!" on HackerNews.

You guys are all a lost cause.

reply
How is attempting to benchmark llms like religion?
reply
Re-read the comment I'm replying to, it's not talking about benchmarks, just models.
reply