Anyone using these tools should absolutely know these risks and either accept or reject them. If they aren't competent or experienced enough to know the risks, that's on them too.
Cursor: we have top notch safeguards for destructive operations, you have our guarantee, we are the best
Author: uses their tools expecting those guarantees to hold (I would expect a confirmation before any destructive operation, enforced outside the prompt as a coded system guardrail)
Cursor AI: Does destructive operation without asking
Author: feels betrayed.
So yeah, I think the author is right to feel that way: they trusted Cursor to have better system guardrails, and it didn't (an agent shouldn't be able to delete a volume without a meta-guardrail that lives outside the prompt). Now the author knows, and so do we: even if companies say they have good guardrails, never trust them. If it's not your code, you have no guarantees.
- assume tokens are scoped (even though scoped tokens apparently aren't even an available feature?)
- assume an LLM didn't have access
- assume an LLM wouldn't do something destructive given the power
- assume backups were stored somewhere else (to anyone reading, if you don't know where they are, you're making the same assumption)
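On that last point: if you don't know where your backups live, checking is cheap. A minimal sketch, assuming your backup job writes a checksum manifest that you store on *separate* infrastructure from the data itself (the paths and manifest format here are hypothetical):

```python
import hashlib
import json
from pathlib import Path

def verify_offsite_backup(backup_path: Path, manifest_path: Path) -> bool:
    """Compare a backup file's checksum against an independently stored
    manifest. If the manifest lives on the same volume as the data,
    this check proves nothing -- which is exactly the point."""
    digest = hashlib.sha256(backup_path.read_bytes()).hexdigest()
    manifest = json.loads(manifest_path.read_text())
    return manifest.get(backup_path.name) == digest
```

Run it on a schedule from a machine that isn't the one being backed up; if it can't find the backup or the manifest, you've just learned you were making the same assumption.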
Also, you should never give LLMs instructions that rely on metacognition. You can tell them not to guess, but they have no internal monologue; they cannot know anything. They also cannot plan to do something destructive, so telling them to ask first is pointless. A text completion only has the information that it wrote something destructive after the fact.
Personally I don't even let my agent run a single shell command without asking for approval. That's partly because I haven't set up a sandbox yet, but even with a sandbox there is a huge "hazard surface" to be mindful of.
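That kind of gate is easy to approximate outside any particular harness. A minimal sketch of a confirm-before-exec wrapper; the destructive-pattern list is illustrative, not exhaustive, and a pattern blocklist is a backstop, not a sandbox:

```python
import re
import subprocess

# Commands that should always require a human in the loop.
# Illustrative only -- a real list would be much longer.
DESTRUCTIVE = [
    r"\brm\b.*-r", r"\bdrop\s+(table|database)\b",
    r"\bvolume\s+delete\b", r"\bmkfs\b", r"--force\b",
]

def needs_approval(cmd: str) -> bool:
    """Return True if the command matches any destructive pattern."""
    return any(re.search(p, cmd, re.IGNORECASE) for p in DESTRUCTIVE)

def run_gated(cmd: str) -> None:
    """Run a shell command, but stop and ask first if it looks destructive."""
    if needs_approval(cmd):
        if input(f"Agent wants to run: {cmd!r}. Allow? [y/N] ").lower() != "y":
            print("Blocked.")
            return
    subprocess.run(cmd, shell=True, check=False)
```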
I wonder if AI agent harnesses should have some kind of built-in safety measure where instead of simply compacting context and proceeding, they actually shut down the agent and restart it.
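A minimal sketch of that idea, with a hypothetical `Agent` standing in for a real one: instead of compacting in place and carrying hidden state forward, the harness tears the agent down at a budget and boots a fresh one seeded only with an explicit summary.

```python
from dataclasses import dataclass, field

TOKEN_BUDGET = 4  # tiny, for illustration only

@dataclass
class Agent:
    """Stand-in for a real agent; holds a growing context."""
    context: list = field(default_factory=list)

    def step(self, msg: str) -> None:
        self.context.append(msg)

    def summarize(self) -> str:
        # A real harness would have the model write this summary.
        return f"summary of {len(self.context)} messages"

def run_with_restarts(messages):
    """Hard-restart the agent (fresh context) whenever the budget is hit,
    instead of silently compacting and proceeding."""
    agent = Agent()
    restarts = 0
    for msg in messages:
        if len(agent.context) >= TOKEN_BUDGET:
            seed = agent.summarize()
            agent = Agent(context=[seed])  # explicit, inspectable carry-over
            restarts += 1
        agent.step(msg)
    return agent, restarts
```

The appeal is that the only thing surviving a restart is the summary you can read, rather than whatever the compaction quietly kept.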
That said I also think even the most advanced agents generate code that I would never want to base a business on, so the whole thing seems ridiculous to me. This article has the same energy as losing money on NFTs.
Humans do make mistakes like these, so I'm not sure where the fault really lies. I can imagine a human under time pressure making the same error. Maybe it's a goof in Railway's safety design: it shouldn't be possible to delete all your backups with a single API call using an ordinary token.
But Railway bears some responsibility too because, at least if the author is to be believed, it looks like they provide no safety tools for users, regardless of whether those users use AI or not. You should be able to generate scoped API tokens. That's just good practice. A human isn't likely to have made this particular mistake, but it doesn't seem out of the question either.
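Scoped tokens don't have to be complicated. A minimal sketch of server-side enforcement; the scope format is made up, and as far as I know Railway's actual API looks nothing like this:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Token:
    # Scopes like "staging:write" or "production:read".
    scopes: frozenset

def authorize(token: Token, environment: str, action: str) -> bool:
    """Deletion is write-level; production writes need an explicit scope."""
    level = "write" if action in ("delete", "update") else "read"
    return f"{environment}:{level}" in token.scopes

# A token handed to an agent working on staging:
staging_token = Token(scopes=frozenset({"staging:write", "production:read"}))
```

With something like this, the agent's `delete` call against production fails at the API boundary no matter what the prompt said.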
Fully agree, but given the rest of this story I don’t imagine the author would have scoped them unless Railway literally forced him to.
> A human isn't likely to have made this particular mistake, but it doesn't seem out of the question either.
The AI agent was deleting the volume used in the staging environment. It happened to also be the volume used in the production environment. 100% a human could have made this mistake.
If you're a software dev/engineer and you haven't made a mistake like this (maybe not at this scale, though), you probably haven't been given enough responsibility, or you're just incredibly lucky.
… although, agreed, they were on the cutting edge, which is more risky and not the best decision.
The fact that this seems to be written by AI makes it even more ironic.
"That isn't backups. That's a snapshot stored in the same place as the original — which provides resilience against zero failure modes that actually matter (volume corruption, accidental deletion, malicious action, infrastructure failure, the exact scenario we just lived through)."
I’ve got a hunch the only person is the CEO.
The domain was registered in October 2025. The site has kind of a weird mix of stuff and a bunch of broken functionality. I think it’s one guy vibe coding a ton of stuff who managed to blow away his database.
> If you're a software dev/engineer and you haven't made a mistake like this (maybe not at this scale, though), you probably haven't been given enough responsibility, or you're just incredibly lucky.
Mistakes are understandable. Having no introspection or self criticism, not so much.
My team practices "no blame" retros, that blame the tools and processes, not the individuals.
But the retro and remediations on this are all things the author needs to own, not Railway or Cursor.
- Revoke API tokens with excessive access
- Implement validated backup and restore procedures
- ...
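"Validated" is the key word in that second item: a backup you've never restored is a hope, not a backup. A minimal restore-drill sketch, with SQLite standing in for whatever database is actually involved:

```python
import sqlite3
import tempfile
from pathlib import Path

def backup_db(src: sqlite3.Connection, dest_path: Path) -> None:
    """Take a backup using SQLite's online backup API."""
    with sqlite3.connect(dest_path) as dest:
        src.backup(dest)

def validate_backup(dest_path: Path, table: str, expected_rows: int) -> bool:
    """Restore drill: open the backup fresh and check it actually
    contains the data we think it does."""
    with sqlite3.connect(dest_path) as conn:
        (count,) = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()
    return count == expected_rows
```

The row-count check is deliberately crude; the point is that *some* automated restore-and-verify step runs regularly, so a snapshot sitting next to the original never gets mistaken for a backup again.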
> A strange game.
> The only winning move is
> not to play.

The system did delete the database because the author built it like that.
I do not feel sorry, but I do feel some real schadenfreude.
Trying to run a blame game is such a facepalm.