While I'm not disagreeing: if you ask an LLM to critique something, it will try very hard to find something to critique, however little criticism is actually warranted. The important thing is that you have to remain the competent judge of its output.
reply
One of the best uses of AI I've found is code review, whether of code I've written entirely myself or even of code generated in a previous session.
reply
Yes, or boilerplate! I usually go in and tweak it anyway because it's not good. But it does help. This agentic coding thing is madness to me.

I switched over to small local models. I do not need the expensive vibe-coder models at all.

reply
But those giant models get the boilerplate correct on the first try! You're totally right, though. My favorite thing to do these days is to hand-craft the code in the middle of the app, then tell the AI to make me a REST endpoint and a test. I do the fun/important part. :D

Though that's coming from someone who can't justify spending thousands on personal hardware and is instead paying $20/month to OpenAI. Might as well use the best.

reply
I hear you on the local model upfront cost. I lucked out: I like to play video games and took my GPU a little too seriously. The buyer's remorse is gone now, I guess.

You can get pretty good results with even smaller models. Can't prompt and pray with them as much, though. So I get it.

DeepSeek is like pennies. I might sign up with them one day.

reply
There is always a chance that the LLM will hallucinate something wrong. It's all probabilities, quite possibly the closest thing to quantum mechanics in action that we have at the macro level. The act of receiving information from an LLM collapses its state, which was heretofore unknown.

However, your actions can certainly influence those probabilities.

> If asked properly, LLMs can be used to poke holes in an existing reasoning or come up with new ideas or things to explore.

At the most basic level, LLMs are prediction engines, and one of the things they really, really want (OK, they don't "want", but one of the things they are primed to do) is to respond with what they have predicted you want to see.

Embedding assertions in your prompt is either the worst thing you can do, or the best thing you can do, depending on the assertions. The engine will typically work really hard to generate a response that makes your assertion true.

This is one reason why lawyers keep getting dinged by judges for citations made up from whole cloth. "Find citations that show X" is a command with an embedded assertion. Not knowing any better, the LLM believes (to the extent such a thing is possible) that the assertion you made is true, and attempts to comply, making up shit as it goes if necessary.
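As a concrete sketch of the difference (the prompts and sample function below are hypothetical, purely for illustration):

    # Minimal sketch: the same review request, phrased two ways.
    # Prompts and sample code are hypothetical, for illustration only.

    source = '''
    def greet(name):
        return f"Hello, {name}!"
    '''

    # Embedded assertion: presumes a bug exists, so the model is primed
    # to "find" one even in perfectly fine code.
    biased_prompt = f"Find the bug in this function:\n{source}"

    # Neutral framing: gives the model an explicit out, reducing the
    # pressure to invent a problem just to satisfy the prompt.
    neutral_prompt = (
        "Review this function for bugs. "
        f"If you find none, say so explicitly:\n{source}"
    )

Same function, but the first prompt hands the model a premise it will do its best to satisfy.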

reply
> never ask a model for confirmation or encouragement; but you can absolutely ask it to critique something, and that's often of value.

What's the difference? The end result is equally unreliable.

In either case, the value is determined by a human domain expert who can judge whether the output is correct, whether it's pointed in the right direction, whether it's worth iterating on or going to be a giant waste of time, and so on. And the human must remain vigilant every step of the way, since the tool can quickly derail.

People who use these tools entirely autonomously and give them access to sensitive data and services scare the shit out of me. Not because the tool can wipe their database or whatnot, but because this behavior is being popularized, normalized, and even celebrated. It's only a matter of time until some moron lets one loose on highly critical systems and infrastructure, and we read about something far worse than an angry tweet.

reply