Maybe they have vibe-coded their own stack!

But less tongue-in-cheek, yeah Anthropic definitely has reliability issues. It might be part of trying to move fast to stay ahead of competitors.

by adastra22 7 hours ago

They have. Claude Code was their internal dev tool, and it shows.

by CuriouslyC 7 hours ago

And yet, even with them dogfooding their own product heavily, it's still a giant janky pile. The prompt work is solid, the focus on optimizing tools was a good insight, and the model makes a good agent, but the actual Claude Code software is pretty shameful for the most viable product of a billion-dollar company.

by shuckles 5 hours ago

What artifact are you evaluating to come to this conclusion? Is the implementation available?

by rmonvfer 3 hours ago

The source for one of the initial versions got leaked a while ago and, let’s say, it’s not very good architecturally speaking, specifically when compared with the Gemini CLI, which is open source.

The point of Claude Code is deep integration with the Claude models, not the actual CLI as a piece of software, which is quite buggy (it also has some great features, of course!)

At least for me, if I didn’t have to put in the work to modify the Gemini CLI to work reliably with Claude (or at least to get a similar performance), I wouldn’t use Claude Code CLI (and I say this while paying $200 per month to Anthropic because the models are very good)

by CuriouslyC 5 hours ago

A. I use it daily to take advantage of the plan inference discount.

B. Let's just say I didn't write the most robust javascript decompilation/deminification engine in existence solely as an academic exercise :)

by Analemma_ 8 hours ago

The tongue-in-cheek jokes are kind of obvious, but even without the snark I think it is worth asking why the supposed 100x productivity boost from Claude Code I keep hearing about hasn't actually resulted in reliability improvements, even from developers who presumably have effectively-unlimited token budgets to spend on improving their stack.

by Uehreka 7 hours ago

I love how people like Simon Willison and Pete Steinberger spend all this effort trying to be skeptical of their own experiences and arrive at nuanced takes like “50% more productive, but that’s actually a pretty big deal, but the nature of the increase is complicated” and y’all just keep repeating the brainrotted “100x, juniors are cooked” quote you heard someone say on LinkedIn.

by CuriouslyC 7 hours ago

AI gives you what you ask for. If you don't understand your true problems, and you ask it to solve the wrong problems, it doesn't matter how much compute you burn, you're still gonna fail.

by cainxinth 7 hours ago

I've been paying for the $20/month plans from Anthropic, Google, and OpenAI for the past few months (to evaluate which one I want to keep and to have a backup for outages and overages).

Gemini never goes down, OpenAI used to go down once in a while but is much more stable now, and Anthropic almost never goes a full week without throwing an error message or suffering downtime. It's a shame because I generally prefer Claude to the others.

by panarky 5 hours ago

Same here, but for API access to the big three instead of their web/app products, and Gemini also shows greater uptime.

But even when the API is up, all three have quite high API failure rates, such as tool calls not responding with valid JSON, or API calls timing out after five minutes with no response.

Definitely need robust error handling and retries with exponential backoff because maybe one in twenty-five calls fails and then succeeds on retry.
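For anyone who wants something concrete, here is a minimal sketch of that retry loop, not anyone's production code. `TransientAPIError`, `call_with_backoff`, and `parse_tool_call` are hypothetical names made up for illustration, and the delays and attempt count are arbitrary assumptions. It treats invalid JSON from a tool call as just another retryable failure.

```python
import json
import random
import time


class TransientAPIError(Exception):
    """Hypothetical stand-in for a timeout, 5xx, or malformed response."""


def call_with_backoff(call, max_attempts=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep):
    """Call `call()` until it succeeds, waiting longer after each failure."""
    for attempt in range(max_attempts):
        try:
            return call()
        except TransientAPIError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Exponential backoff: base_delay * 2^attempt, capped at max_delay,
            # with full jitter so many clients do not retry in lockstep.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, delay))


def parse_tool_call(raw):
    """Treat invalid JSON from a tool call as a retryable failure."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError as exc:
        raise TransientAPIError(f"invalid JSON in tool call: {exc}") from exc
```

You would wrap the real request in a function that raises `TransientAPIError` on timeouts and bad payloads, then pass it to `call_with_backoff`. If failures were independent at 1 in 25, three in a row would happen about once in 15,625 calls, though real outages tend to be correlated, so a cap on attempts still matters.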

by boarush 4 hours ago

Invalid JSON and other formatting issues are more a matter of model behavior, I would say, since no model guarantees that level of conformance to the schema. I wouldn't necessarily lump them in with API downtime.

by j45 4 hours ago

A lot of people might be discovering their preference for Claude.

by RobertLong 8 hours ago

All the AI labs are, but Anthropic is the worst. Anyone serious about running Claude in prod is using Bedrock or Vertex. We've been pretty happy with Vertex.

by boarush 8 hours ago

I wonder why they haven't invested a lot more in the inference stack? Is it really that different from Google, OpenAI and other open weight models?

by ihaveajob 8 hours ago

Have you used Bitbucket?

by boarush 8 hours ago

A core research library for MATLAB that I used in a course project used to be on Bitbucket, though thankfully I didn't have to deal with a lot of collaboration there.

by paulddraper 4 hours ago

OpenAI used to be just as bad if not worse.

But they've stabilized over the past 5 months.