value is high but what about the competitors?

is claude that good? the last time i tried claude it was sonnet 4.5. it was ok, but clearly not worth the api money. then again, i only use api tokens for llms.

If you look at SWE, Claude models aren’t that special. Other benchmarks come up with different results.

But… anecdotally, Claude is just that good. Gemini needs a lot of hand-holding, and it will still tell you it’s done when it has finished only half the work. Or say, “this test isn’t passing, I’ll just delete it”. Every now and then I get tired of it and give the same task to Sonnet 4.6; 5 minutes later I’m done. Bug fixed, UI working properly, React hooks not being called conditionally, theme variables used properly. It’s wonderful.

I’m not sure about large agentic work or deep thinking, but I’m mostly automating away the drudgery of dealing with React Native. I still want to do the deeper work myself, but even there Opus is usually a really good sparring partner.

Matches my experience. I am not sure why, but subjectively it feels better.

Were you using the Gemini model with the Claude Code harness? Otherwise, it is not an honest comparison.