In terms of "junior dev following" it would be the model trying to think and write it as a Senior or Staff Level engineer would.
Humorously, this could be the result of LLMs vacuuming up all the sentiment on the web that the code that LLMs produce is trash-tier.
Maybe a more idealized training set could improve things, but at least for today’s SOTA, you have to get the shitty first draft out and then improve it.
Harnessing makes a difference, but it’s only shuffling around when and where the tokens get generated. It can trade being slower by doing a hidden first draft and only showing the output after doing a self review. But the models still need to generate it all explicitly.
I did an experiment on this a few weekends ago and Codex for example was a lot more adversarial and thorough in its review when given Claude-authored code compared to when given the same code with "I wrote this, can you review it?"
It's not the default, because the training data is full of unmaintainable code done wrong with mistakes. People literally complain that LLMs write too many tests or add comments.
If instead of "do it right", you give it specific actionable advice of how to right code, it does surprisingly well. Newer frontier models also do a great job of mimicking the style and rigor of the surrounding codebase without prompting, if you're working in an established codebase, for better or worse.
You never wrote quick exploratory code? One off scripts? How is the Ai suppsed to know unless you tell it.
If you tell another person to write some code, how are they suppsed to know? If you have your boss come to you and ask you to write some code to do some data analysis are you going to spend weeks writing units tests and perfect abstractions? Or do it quick and get the data and result?
For example, I built up a programming language from scratch with Claude, it knows nuances about my languages syntax, and can write code in my language effectively. I did it mostly as a test. It definitely helped that my language is heavily mostly Python based.