For better or worse, a lot of people seem to disagree with this, and believe that humans reading code is only necessary at the margins, similarly to debugging compiler outputs. Personally I don't believe we're there yet (and may not get there for some time) but this is where comments like GP's come from: human legibility is a secondary or tertiary concern and it's fine to give it up if the code meets its requirements and can be maintained effectively by LLMs.
reply
I rarely see LLMs generate code that is less readable than the rest of the codebase it's been created for. I've seen humans who are short on time or economic incentive produce some truly unreadable code.

Of more concern to me is that when it's unleashed on the ephemera of coding (Jira tickets, bug reports, update logs) it generates so much noise you need another AI to summarize it for you.

reply
The main coding agent failure modes I've seen:

- Proliferation of utils/helpers when there are already ones defined in the codebase. Particularly a problem for larger codebases

- Tests with bad mocks and bail-outs due to missing things in the agent's runtime environment ("I see that X isn't available, let me just stub around that...")

- Overly defensive off-happy-path handling, returning null or the semantic "empty" response when the correct behavior is to throw an exception that will be properly handled somewhere up the call chain

- Locally optimal design choices with very little "thought" given to ownership or separation of concerns

All of these can pretty quickly turn into a maintainability problem if you aren't keeping a close eye on things. But broadly I agree that, line for line, frontier LLM code is generally better than what humans write and miles better than what a stressed-out human developer with a short deadline usually produces.
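
To make the third bullet concrete, here is a minimal sketch of the difference between the overly defensive version and the one that throws. Everything in it (`USERS`, `UserNotFound`, `find_user_defensive`, `find_user_strict`, `handle_request`) is a hypothetical name made up for illustration, not taken from any real codebase.

```python
class UserNotFound(Exception):
    """Hypothetical domain error, for illustration only."""


USERS = {1: "ada"}  # stand-in for a real data store


def find_user_defensive(user_id):
    # Overly defensive: the failure is hidden behind a None that
    # every caller has to remember to check for.
    return USERS.get(user_id)


def find_user_strict(user_id):
    # Fail loudly so a handler further up the call chain can decide what to do.
    try:
        return USERS[user_id]
    except KeyError:
        raise UserNotFound(f"no user with id {user_id}") from None


def handle_request(user_id):
    # One place up the call chain owns the "missing user" policy.
    try:
        return f"hello {find_user_strict(user_id)}"
    except UserNotFound as exc:
        return f"404: {exc}"
```

With the strict version the "what happens when the user is missing" decision lives in one handler instead of being scattered across every call site as a None check.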

reply
Oh god, the bad mocks are the worst. Try adding instructions not to make mocks and it creates "placeholders", ask it to not create mocks or placeholders and it creates "stubs". Drives me mad...

To add to this list:

- Duplicate functions when you've asked for a slight change of functionality (e.g. write_to_database and write_to_database_with_cache), never actually updating all the calls to the old function, so you end up with a split codebase.

- In a similar vein, the backup code path of "else: do a stupid static default" instead of erroring, which would be much more helpful for debugging.

- Strong desires to follow architecture choices it was trained on, regardless of instruction. It might have been trained on some presumably high quality, large and enterprise-y codebases, but I'm just trying to write a short little throwaway program which doesn't need the complexity. KISS seems anathema to coding agents.

reply
And Sturgeon tells us 90% of people are wrong, so what can you do.
reply
Compiled natural language is meant for the machine; written natural language is for other humans.
reply
If AI is the key to compiling natural language into machine code like so many claim, then the AI should output machine code directly.

But of course it doesn't do that because we can't trust it the way we do a traditional compiler. Someone has to validate its output, meaning it most certainly IS meant for humans. Maybe that will change someday, but we're not there yet.

reply