undefined

points

[-]

The percentage of times I prompt claude "what about checking if there are any child processes running?" or "Would using a lock here greatly simplify the design?" only to have myself be correct is approaching 100%. That is it isn't just claude sycophantically agreeing with me. The code itself becomes smaller, simpler, and more reliable with fewer bugs.

The agents tend to produce working code but the larger the scope the bigger the mess they tend to make. They will happily evolve toward a local maxima but leave world-destroying bugs lurking in the implementation.

The other issue is that claude regularly ignores explicit instructions in CLAUDE.md or in prompts. It will "helpfully" decide to just start doing whatever it wants or reinterpret instructions completely differently than it did the last 100 times.

It has nothing to do with losing control or trust. LLMs are not conscious. They have no executive function. They aren't even thinking. They're just models predicting the next word in the script. They are very useful tools but that's all they are: tools.

by lukan6 hours ago|

prev|

[-]

No. Because they still hallucinate at times. Confuse things. Forget things. Or none of the above, as it is anthropomorphizing, but the result is the same. They can make incredible working one shots, you start to trust them, then you trust too much and .. feel the result.

by alexwwang6 hours ago|

parent|

[-]

Yes. I am fighting with the disobeyance of LLM on working through my pipeline commands. I believe these violations are caused by its hallucinations. So I am still developing a mechanical system to monitor agents’ behaviors automatically. I believe these routines and monitors will play as a set of scaffold to keep leading the LLM on the right way all the time.