Didn't work for the prod data that the AI nuked in spite of prompts saying "DON'T FUCKING GUESS", just like that, in all caps: https://news.ycombinator.com/item?id=47911524

What makes you think it will work for you?

by lukan 14 hours ago

That I don't let agents run wild in a production environment?

by ubertaco 10 hours ago

You let them write code that runs in prod, which is the same thing with extra steps.

Unless you review that code carefully, and then we're back to the point about it not saving you any cognitive overhead.

by lukan 5 hours ago

Of course it saves me overhead: I don't have to read all the necessary docs etc. myself or type everything out, I just check the resulting code.

by enraged_camel 2 hours ago

>> You let them write code that runs in prod, which is the same thing with extra steps.

The “with extra steps” is doing a lot of work in that sentence.

by byzantinegene 5 hours ago

Your spec is a guideline, not something the LLM has to adhere to. It is definitely not guaranteed to work without error.

by lukan 4 hours ago

Are humans guaranteed to work without error?

by habinero 11 hours ago

> if I made a very clear spec - I can be almost sure

That "almost" is doing a lot of heavy lifting here. This is just "make no mistakes" "you're holding it wrong" magical thinking.

In every project, there is always a gap between what you think you want and what you actually need. Part of the build process is working that out. You can't write better specs to solve this, because you don't know what it is yet.

On top of that, you introduce a _second_ gap of pulling a lever and seeing if you get a sip of juice or an electric shock lol. You can't really spec your way out of that one, either, because you're using a non-deterministic process.

by lukan 4 hours ago

Well, unfortunately it is the same with real humans, who happen to be non-deterministic as well. If I give them a task, I can be almost sure they will do it. But even humans can have unexpected psychotic breakdowns and do destructive stuff like deleting important databases.

So right now, humans are for sure more reliable. But it is changing. There are already things I trust an LLM with more than a random human, or even certain humans I know.