Just a thought experiment, I very much doubt I'm the first one to think of it. It's probably in the same line of "why doesn't an LLM just write assembly directly"
I liken it to the problem of applying machine learning to hard video games (e.g. Starcraft). When trained to mimic human strategies, it can be extremely effective, but machine learning will not discover broadly effective strategies on a reasonable timescale.
If you convert "human strategies" to "human theory, programming languages, and design patterns", perhaps the point will be clear.
But: could the ouroboric cycle of LLM use decay the common strategies and design patterns we use into inexplicable blobs of assembly? Can LLMs improve at programming if humans do not advance the theory or invent new languages, patterns, etc?
The current training loop for coding is RL as well - so a departure from human coding patterns is not unexpected (even if departure from human coding structure is unexpected, as that would require development of a new coding language).
My suspicion is that the "language" part of LLMs means they tend to prefer languages which are closer to human languages than assembly and benefit from much of the same abstractions and tooling (hence the recent acquisition of bun and astral).
> It's a shame, because it's still the best coding agent, in my experience.
If it is the best, and if it delivers the value users are asking for, then why would they have an incentive to make further $$$ investments to make it of a "higher" quality if the value this difference could make is not substantial or hurts the ROI?
On many projects I found this "higher quality" not only to be false of delivering more substantial value but actually I found it was hurting the project to deliver the value that matters.
Maybe we are after all entering the era of SWE where all this bike-shedding is gone and only type of engineers who will be able to survive in it will be the ones who are capable of delivering the actual value (IME very few per project).
Or that's why tgey had to buy bun with actual engineers to work on Claude Code to reduce memory peaks from 68 GB (yes, 68 gigabytes) to a "measely" 1.7? Because code quality doesn't matter?
Or that a year later they still cannot figure out how to render anything in the terminal without flickering?
The only reason people use Claude Code is because it's the only way to use Anthropic's heavily subsidized subscription. You get banned if you use it through other, better, tools.
Now whether that’s actually possible is a second topic.